1 Introduction

In recent years, autonomous driving has attracted the interest of many stakeholders, such as car manufacturers, tech companies and policy makers, as it can significantly improve road safety and energy efficiency in transportation. Autonomous vehicles face multiple technical challenges that can be broadly categorized into five main areas [1]: (1) route generation, (2) perception, (3) decision making, (4) local path planning and (5) control. Route generation is responsible of finding a global route between origin and destination, identifying the waypoints needed to travel between these two objectives from a pre-known digital map. Perception deals with accurate localization and dynamic scene understanding, detecting all agents in a scene and predicting their motion. The decision making process deals with high-level planification of driving manouvers, such as performing an overtake or giving way. Local path planning receives information from all other previous subsystems to decide a short-term route according to some optimization criteria. Finally, the controller performs the actions needed so that the car accurately follows the trajectory, be it via steering (lateral control) or throttling and braking (longitudinal control). These concepts are further explained and detailed in the works of Paden [2] and Milanes [3].

One of the most successful techniques in path planning and control systems is based on the model predictive control (MPC) framework, which evaluates both optimal vehicle trajectory and control actions over a finite time-horizon. However, effective MPC systems are based on constrained optimization processes, which involve high computational costs, rendering them mostly unable to perform as real-time systems. Moreover, real-world tuning of parameters is often unfeasible, due to the hight number of different values that dictate the behaviour of the system.

This makes the MPC design process benefit greatly from the use of simulation environments. Autonomous driving simulators are playing an increasingly important role in the development and testing of systems in autonomous vehicles, since they allow massive automated validation and testing in any type of scenario, as well as parameter refinement without physical or real-time constraints. Nevertheless, this powerful methodology is only applicable when the discrepancy between simulated and real environments is reasonably low and is thoroughly narrowed. This discrepancy is usually referred to as sim-to-real gap (simulation- to-reality) or reality gap.

One approach to ensure the reality gap is as small as possible resides in the use of realistic driving simulators, suites that show a high defree of similarity in terms of the physical beahvior of the vehicle under different weather conditions, road rugosity and friction, road profile, and several other factors, which are a fundamental part in the development and testing of innovative algorithms of various kinds applied to autonomous driving. Tackling a development process by using full-environment simulation as a tool that shortens the development cycle simplifies validation and increases efficiency, given that the reality gap is within critical limits.

1.1 Motivation and problem statement

Figure 1a shows the life cycle through which the development and research of a new system or function. The life cycle is exemplified by the analysis, design, development and testing/validation of a path planning and control system for an autonomous vehicle. A clear benefit of the acknowledgment of the reality gap and its mitigation arises in the capability of testing a given system within its simulation environment of development, since the sources of error between the simulation and is projected behaviour in the real world are well known and its values are narrowed down into a known interval with a specific trust value.

Fig. 1
figure 1

Life cycle of the development of a system. a Development, testing and validation in real environments; b development and testing in simulated environments and validation in real environments

Modern and domain-specific simulators cover various aspects of the development process, such as scenario generation, data gathering, realistic physics laws, natural traffic flow, and photo-realistic computer graphics. Both automotive and Big tech companies invested in self-driving vehicles and have widely adapted simulation environments to test and validate their systems. However, this is only feasible if it can be justified that the implemented system behaves similarly in both environments, demonstrating a sufficiently small reality gap, so that the simulated validation can be used to qualify the system as viable. Therefore, quantifying the simulation and real-world transferability is an essential step for developing autonomous driving algorithms and systems in simulation.

Waymo has accumulated the most virtual mileage of all companies investing in self-driving technology, with a total of eleven years and fifteen million simulated miles in February 2021 [4]. Their simulator, Carcraft, transforms real-world scenarios into virtual formats and runs 25,000 virtual cars simultaneously. The massive data flow from this process assists engineers in locating bugs and adjusting models efficiently. Apart from Carcraft, Apollo (Baidu) [5], XVIZ (Uber), AVS (GM Cruise), VIRTTEX (Ford), Nvidia Driveworks are some of the premium driving simulators used by the self-driving vehicle industry. However, this simulation suites are often proprietary, and require a high fee to be deployed in institutions other than the ones they were developed. Academia relies on a number of open-source or otherwise readily available alternatives, such as CARLA [6], Microsoft AirSim [7], VisSim, CarSim [8] and Gazebo [9].

However, as far as the authors are aware, there has been little to no work performed in quantifying correlation between simulation results and different real world scenarios. There have been a number of recent studies which review simulators and their performance when benchmarked with open real-world datasets revelant to the development of autonomous vehicles, such as [10], but to the best of our knowledge the reality gap remains unexplored as a means to further empower the design process inside simulation suites.

The aim of this work is to model sim-to-real transferability in a manner that allows the development of a systematic and efficient process for designing, testing, verifying and validating a steering controller for an autonomous vehicle. What this framework hopes to accomplish is to eliminate the need of real-world prototypes up until the very last step of design, which is projected to be the evaluation of comfort and perceived safety by the occupants of the vehicle. Our solution was tested on a MPC with artificial potential fields, which is the controller employed in both the simulation environment and the prototype to test our approach.

The reality gap for autonomous driving can be analyzed from different perspectives depending on the specific task. In our case, we showcase a statistical approach to quantifying the reality gap in the context of path planning. The analysis of acquired data both in the DRIVERTIVE platform [11] and in the simulation environment allows for both parameter optimization and to spot poorly handled driving scenarios that need to be specifically recognised and included within the design aspect of the controller. This leads to a more efficient, elegant and effective solution for path planning and vehicle control, the aspects of autonomous driving that this controller is in charge of handling.

The authors note that, while this work has been performed in a specicic simulation suite -namely, CARLA-, the framework presented relies only on data acquisition and statistical analysis, which makes it transferable to any an all other simulators relevant to the autonomous driving research scene. The methodology described in this paper is exemplified with the use of a given MPC controller and an already established platform, but the statistical framework holds true for any given implementation of a simulated solution or a prototyped one.

1.2 Paper structure

The contents of this paper are structured as follows: Section 1 conveys an introduction in which the scope of the problem is introduce. Section 2 includes a discussion on the state of the art concerning reality gap and its assessment, along with a discussion on predictive control and its application in autonomous driving and a concise and clear statement on what the main contribution of this work aspires to be. Section 3 is composed of several subsections in which the entirety of the framework of our investigation is based upon, namely the autonomous driving platform, the model of motion and its application in the MPC controller, the design characteristics of the adaptive potential field subsystem, and a definition of the statistical analysis that will be carried out on the acquired data streams. Section 4 presents the results of our experiments both in terms of controller performance and perceived safety, as well as a comparison against an existing MPC controller. Finally, Section 5 includes the conclusions that can be derived from this work, as well as the future lines of work that arise from it.

2 State of the art

2.1 Reality gap in autonomous driving control systems

The analysis of the reality gap in control systems has been carried out in depth in the field of robotics. Wenshuai Zhao et al. present in [12] a survey on sim-to-real transfer techniques applied in robotics by developing control systems in simulated robotic systems based on reinforcement learning. They also discuss the need to evaluate the reality gap between simulated and real behaviour. Salvato et al. [13] conducted a recent survey about closing the reality gap and reduce the sim-to-real transferability when developing a robot controller based on reinforcement learning. The reality gap was identified and the implications of its closure and development states were discussed. However, as identified by Valassakis et al. [14] when analyzing how to reach a zero sim-to-real transfer in complex dynamics systems, the correlation between simulation and real-world results is rarely quantified, mainly due to the difficulty and risks inherent to testing in real environments.

Regarding autonomous vehicles, the authors have not found a solid line of research on how to evaluate and quantify the reality gap derived from using simulation as a tool for designing real-world control systems (e.g., trajectory tracking). Some preliminary work attempted to address sim-to-real transferability in highly simplified simulation environments with a loose resemblance to real driving scenarios, as in the case of miniature cars [15]. The same type of simplified environments has recently been proposed to quantify the reality gap in autonomous driving (i.e., localization) in terms of accuracy, behaviour and failures [16]. However, the conclusions that can be drawn from these simple environments are not directly applicable to real autonomous driving environments.

2.2 Predictive control for autonomous driving

Recently, several attempts have been carried out with the goal of employing MPC in the realm of autonomous driving, typically in path planning, trajectory tracking, obstacle avoidance, or overtaking manoeuvres. In [17], a high-speed obstacle avoidance system was developed based on MPC, manoeuvring in the shortest time achievable and under dynamic safety limits. The obstacle avoidance would be executed when the vehicle does not have enough time to stop. This work considered a four-wheel dynamic model involving lateral tire forces and slip angle, where lateral tire forces are defined as non-linear. The high number of constraints defined to fulfil the optimization process renders excessive computation time and a not properly convergent process. Thus, p-norm aggregation techniques were applied to improve both the computational cost and the optimization result. In [18] a linear-time-varying MPC (LTV-MPC) was integrated as a slip angle controller, to stabilize a vehicle during a sudden lane change or or when entering a curve with excessive speed. The MPC controller was designed using a three degree of freedom non-linear vehicle model to fulfil the dynamic constraints such as slip angle or lateral tire forces. However, even though both examples provide interesting and academically meaningful discussion, the results and conclusions are entirely based on simulation environments.

In two examples of obstacle avoidance based on artificial potential fields [19, 20], a two stage system involving trajectory planning and trajectory tracking was used. Trajectory planning was implemented with artificial potential fields considering surrounding vehicles, road boundaries, and lane centers. In the first controller [19], several possible safe trajectories are generated. One of them is selected based on the state of the potential field of the surrounding environment, solving an optimization problem that generates a sub-optimal solution to the problem. An MPC controller then follows the sub-optimal selected trajectory in the later stage. In the second controller [20], the optimization problem is designed to move over the minimum potential filed values, so the path-planning and control tasks are performed in the same stage. Both works showcase splendid results in simulation settings, but do not discuss an implementation in a real prototype to assess its performance in a real world environment. A hybrid planning approach based on MPC and parametric curves for overtaking manoeuvers has also been presented in [21]. In this work, the passenger comfort associated with the smoothness of Bézier curves was combined with the reliable capabilities of MPC to react to unexpected conditions, such as lane obstacles, overtaking maneuvers and lane changes.

A hybrid trajectory planning approach for roundabout merging scenarios has been demonstrated in [22]. Roundabout merging is solved with a nominal trajectory generated through Bézier curves combined with an MPC to ensure safe future states. The computation time is reduced without sacrificing the performance of both techniques while providing safe execution of the manoeuvre. The approach has been tested in a real-world setting. However, no reality gap indicator is evaluated for improving development scalability.

2.3 Contributions

The main contribution of this work is to provide a solution to the trajectory tracking problem in real world environments using a simulation based approach. Our hope is to circunvent some of the risks and difficulties involved in carrying out the desing with a real prototype, such as road hazards, glitches and high costs.

To achieve this goal, we showcase a MPC steering controller based in adaptive potential fields that has been designed to be a reliable system both via including physical constraints in the optimization stage that deal with saturation in the input signal by enforcing a steering and steering speed limit, and by optimizing the code to run as fast as possible, so that the system can be guaranteed to operate within a specified time interval. We have developed a simulation environment that replicates a real trajectory as closely as we could manage, in hopes of reducing the variability of behaviours between the simulator and our real prototype. By statistically analising data acquired in both settings, we are able to determine if the behaviour of the simulation closely ressembles its counterpart, which enables a full software approach on the desing of the controller, thus supressing the need of testing in the real platform until the later stages of deployment. Our testing setting of choice includes well-differentiated scenarios and road structures, such as straights, curves and roundabouts, with a variety of curvature radii, which ensures the adaptability of the controller and makes for a complete suite of situations in which to test our approach.

A lesser contribution of our approach resides in the spatial distribution of the prediction horizon in the MPC controller. A difference in performance was noticed when the spatial horizon of the controller was distributed other than linearly, which will be assessed in terms of trajectory tracking.

To the authors’ knowledge, this is the first study dealing with sim-to-real transfer and reality gap quantification of autonomous vehicle control task that includes results with a real automated vehicle in real traffic conditions, including different types of intersections, curves and roundabouts.

3 Framework

The paper aims to evaluate the reality gap of a steering control system of an autonomous vehicle due to sim-to-real transfer done in design and implementation, with a methodology that compares the controller’s performance in a real environment and its virtual equivalent. The driving simulator used in this work does not include a Citröen C4 vehicle in its collection; so a vehicle with similar characteristics (Toyota Prius) was used, parameterizing its localization device with a precision of 2 cm, such as the real car counterpart. The real and simulated car geometry was based on the center of gravity and is further described in Section 3.2. Standard road features were used in both real and simulated scenarios. The vector [xw,yw,v,ψ]T defines the state vector of the system, where xw is the Easting, yw is the Northing, v longitudinal velocity of the vehicle and ψ heading of the vehicle. The state vector was sampled at 100ms in the real vehicle, and the simulator was configured to provide the same rate.

Please note that the work does not aim to reduce the reality gap understudy, improving the sim-to-real transfer with modifications to the simulator environment or changing the MCP movement model. Instead, the work formally evaluates the behaviour differences a posteriori to quantify whether the proposed simulator and design are, in general, valid for developing real-world trajectory tracking systems for autonomous vehicles. For example, the procedure may be helpful if the reality gap assessment is below a threshold set by experts. Thus, in the proposed scenario, an MPC steering controller bases on artificial-potential fields, described in Sections 3.3 and 3.4, was implemented with an interactive process under simulation and then compared the behaviour in the two environments, simulation and real. The comparison is made in terms of lateral error and controls action variation, as is described in Section 4.

3.1 Autonomous driving platform

The autonomous driving prototype is a commercial Citröen C4 with no low-level access to any of the actuators. A complete rundown of the automation systems and their characteristics can be found in [11].

The vehicle is equipped with a MPU 9255 Inertial Mass Unit (IMU) and a BMP180 altimeter, rendering a 10 degrees of freedom sensing system. In our control process, only the accelerometer and gyroscope of the IMU are evaluated. Car odometry is obtained through the Control Area Network (CAN) bus, and is read by the computer that implements the control algorithm as a state input. Position feedback is provided by a kinematic differential GPS, combined with the odometry of the vehicle by an Extended Kalman Filter (EKF). A DC motor with an EPOS driver is in charge of actuating over the steering wheel and positions it at the angle determined by our control system.

System state, i.e. vehicle variables, are used as input of the control architecture. While the hardware installed in the car permits acquiring plenty of information about its full state, we have limited the control inputs to those than can be faithfully recreated within the simulator, as to enable comparisons between it and the real world.

The simulation suite that powers this work is based on CARLA (CAR Learning to Act). Our workflow within the simulator is based on control development. A real world route is chosen as a path to benchmark our controller. which is then captured with software tools and follows an ingest process in order to adapt real world data from Geographical Information Systems (GIS) sourced from OpenStreetMap to a set of waypoints that can be recreated within the simulator. This process, paired with a faithful model of the environment of the path allows us to construct maps that accurately represent the conditions in which the car will be found when real world tests are performed.

This software structure allows for monitoring both the state of the car and its behavior in simulation in real time from within the vehicle itself. This feature has been specifically implemented to provide a real-time comparison between the simluation and its real counterpart, and allows for quick assessment of strange dynamics and sources of error.

All tests have been performed on an Intel i7-7700K platform, equipped with 32 GB of RAM and a SATA SSD, and under the Ubuntu operating system. This hardware configuration has remained constant through both simulation and real tests, and has been proven to comply with the real-time requirements of the control system.

3.2 Motion model

The behaviour of the MPC heavily relies on the intrinsic model used. In this work, the well-known kinematic bicycle model is used, where the position, heading, and velocity variables represent the system state vector [xw(t), yw(t),ψ(t),V (t)] and the front tire angle variable is the control action δ(t). The model assumes that the vehicle is symmetric and has no motion on the z-axis.

The model is based on estimating the direction of the velocity vector, β(t), at the simplified vehicle centre of gravity concerning the vehicle’s longitudinal axis. This angle is a function of the vehicle length lf + lr and the tire wheel angle δ(t). Vehicle yaw ϕ(t) is evaluated with the projection on the longitudinal axis of the velocity vector \(V \cos \limits (\beta (t))\) and the tangent of the tire wheel angle \(\tan (\delta (t))\). The heading ψ(t) is evaluated with the accumulated ϕ previously evaluated, as expressed in (1a).e. Finally, the decomposition of the velocity vector V dt concerning the world reference frame allows the evaluation of [xw(t),yw(t)] (car position in the world reference), as shown in (1a).(c-d).

$$ \begin{array}{@{}rcl@{}} \beta(t) &=& \tan^{-1} \left( \frac{l_{f} \tan(\delta(t))}{l_{f} + l_{r}} \right) \end{array} $$
(1a)
$$ \begin{array}{@{}rcl@{}} \phi(t) &=& \frac{v dt \cos(\beta(t))}{l_{f} + l_{r}} \tan(\delta(t)) \end{array} $$
(1b)
$$ \begin{array}{@{}rcl@{}} x_{w}(t) &=& v dt \cos(\psi(t) + \beta(t)) + x_{w}(t-1) \end{array} $$
(1c)
$$ \begin{array}{@{}rcl@{}} y_{w}(t) &=& v dt \sin(\psi(t) + \beta(t)) + y_{w}(t-1) \end{array} $$
(1d)
$$ \begin{array}{@{}rcl@{}} \psi(t) &=& \phi(t) + \psi(t-1) \end{array} $$
(1e)

where β is the velocity vector direction relative to the longitudinal axis, ϕ represents the vehicle’s yaw angle along its longitudinal axis, ψ is the heading angle, xw and yw are the Xw and Yw components in the global reference system and δ is the tire angle.

Figure 2 illustrates the model behaviour, which is executed recursively to determine the predicted state of the vehicle over the number of tentatives defined by the MPC controller.

Fig. 2
figure 2

Motion model included in the MPC controller

Table 1 defines the parameters used on the model for simulation and real conditions. Simulated parameter values are obtained from CARLA simulator databases, and real parameters are measured in the Citroën-C4 real car.

Table 1 Model parameters on simulated and real ego-vehicle

3.3 Artificial potential field design

Potential fields are widely used to develop path planning and trajectory tracking systems. The algorithms that incorporate this technique base their operation on moving through the minimum of the artificial potential field to achieve optimal planning results. This work integrates potential fields in the cost function of the MPC controller in such a way that the control actions of the prediction horizon move the vehicle through the minimum value of the field. This case is framed within the trajectory tracking paradigm, defining a potential field of Gaussian symmetry. Its minimum value coincides with the road map reference, corresponding with the center of the lane, preventing the vehicle from moving laterally out of the lane. Equation (2) represents the potential field used, where the transversal standard deviation σy is a linear function of vehicle speed, as seen in (3). This design makes the field narrow with increasing vehicle speed to improve the convergence of the optimization algorithm, reducing the control actions’ increments.

$$ Pf_{lane} = -a_{lane} e^{-\left (\frac{(y_{tentative} - y_{ref})^{2}}{2\sigma_{y}(v)^{2}} \right )} $$
(2)
$$ \sigma_{y}(v) \text{[m]} = \left\{\begin{array}{ll} 4 \text{if} v < 30 m s^{-1}\\ \quad5.8 - 0.06 v \\ 1 \text{if} v > 80 m s^{-1} \end{array}\right. $$
(3)

The position and orientation of each Gaussian curve are defined on the road map, a set of points equidistant at 10 cm representing the circulation path. Through testing, it was found that a value that yielded good results was alane = 2. The increment (ytentativeyref) is oriented with the heading of the reference. Thus, the proposed solution corresponds to a Gaussian weighting process of the lateral error. The image Fig. 3 represents the potential field assigned to the testing roadmap at a speed of 30km h− 1

Fig. 3
figure 3

Potential field of real and simulated scenario. Roundabout detailed

3.4 MPC formulation

Predictive control systems base their operation on an optimization process with constraints that use a given model to predict how the system will behave over a discrete control horizon, and deriving optimal control actions from that prediction. When designing MPC controllers that need to generate a control action in less than a given time quantity, as is the case in control systems applied in real systems, it must be ensured that the optimization process converges in less time than specified. For this reason, the number of iterations and the convergence criterion (ftol) in the optimization process are the main parameters that fix the computation time of the controller and must be evaluated. If the number of iterations is reduced or the convergence criterion is lax, the computation time is reduced, but the control actions will not be optimal, increasing output error. Therefore, to find a compromise between the number of iterations and the convergence criterion to implement a robust controller in real-world conditions evaluating the reality gap is essential. Through testing, it has been validated that the MPC controller proposed in this works fulfills an established requirement of generating a control action every 100ms. This time constraint has been chosen through experience in design and by assessing the needs of a steering controller and the capabilites of the hardware employed.

Considering the factors mentioned above, the controller is designed with four tentatives (predictions) to cover a prediction horizon of 25m, which is valid for any velocity. The tentatives distribution is non-uniform to improve the controller behaviour. Starting at zero from the vehicle position, the first is at 25/8m, the second at 25/4m, the third at 25/2m and the fourth at 25m, as justified in Section 3.4.1.

If focusing on solving a non-convex optimization problem solutions associated with local minima are generated, differing from an optimal or global solution. The gradient descent algorithm, with momentum descent, which evaluates the rate of change of the variables to be optimized, or random stochastic gradient descent, mitigating the local minima problem. However, the optimization techniques based on Newton’s method also offer variants to avoid the problem, although they do not guarantee it either.

The optimization stage of a MPC controller usually involves non-convex oprimization, which leads to search spaces that contain local minima and therefore can render solutions that do not reach the global optimum. Algorithms such as gradient descent (and its variants), which evaluates the rate of change to the variables under optimization, or those of the family of Newton’s method, cannot deal with the complicated search space of a non-linear constrained cost function, even though they can be enhanced with techniques like the inclusion of stochastic terms or momentum concepts to improve its exploration capabilities.

Therefore, for cases where a minimization problem with constraints is involved, the techniques described above are not suitable, and therefore linear programming (LP) or nonlinear programming (NLP) methods need to be applied. The paper [23] describes a computationally very efficient learning/optimisation algorithm, which is of interest for this work because of its capacity to increase the number of tentatives for the optimisation process of the MPC controller, ensuring its real time characteristics. The optimisation process is usually based on backpropagation techniques to obtain the optimal fit of any neural system weights. However, there are other alternatives to backpropagation, such as a dendritic event-based processing, which efficiently communicates the error between iterations without sharing weights in the optimisation process. In practice, MPC controllers are solved with NLP algorithms with constraints due to the nonlinear nature of the objective function and constraints. Nowadays, there are a wide variety of optimizers such as sequential linear programming (SLP), sequential quadratic programming (SQP), sequential least squares programming (SLSQP) or generalized reduced gradient (CRG), among others. This family of algorithm relies on a Taylor approximation of the objective function near an initial point, from which the search space is explored. The optimizer used for the MPC controller in this work is SLSQP, due to its reduced computation time compared to other alternatives. The cost function, which includes the value of the potential field, the heading error, and the tire angle increments, is defined as

$$ \begin{array}{@{}rcl@{}} J &= &w_{pf} \sum\limits_{i = 1}^{N} Pf_{lane} (i) + \\ &&w_{\psi} \sum\limits_{i = 1}^{N} (\psi_{car} (i) - \psi_{ref} (i))^{2} + \\ &&w_{\delta} \sum\limits_{i = 1}^{N-1} (\delta(i) - \delta (i-1))^{2} \end{array} $$
(4)

where N is 4 and the weights are 1m− 1, 0.03rad− 1 and 0.2rad/s− 1 corresponding with the weights [wpf, wψ and wδ], respectively. The constraints considered for the optimization process are classified into two types, tire angle limits and yaw angle limits, and are defined as

$$ \begin{array}{@{}rcl@{}} && \underset{\delta}{\text{minimize}} J(\delta) \\ && {\text{subject to}} -60^{\circ} \leq \delta(i) \leq 60^{\circ},\quad {\forall{i \in{[1,4]}}}\\ && \qquad\qquad -5^{\circ} \leq \phi(\delta, i) \leq 5^{\circ}, \quad {\forall{i \in{[1,4]}}} \end{array} $$
(5)

The control horizon in the MPC implementation is 1. Thus, when the ego-vehicle is in time t, the control action applied in t + 1 will be the first tentative assessment by the controller. The control horizon is represented in Fig. 4 in red colour.

Fig. 4
figure 4

Spatial tentative distributions

3.4.1 Tentative distance distribution analysis

In practice and simulation, it is observed how the controller yields more or less lateral error depending on the approach used in the spatial distribution of tentatives due to the discretisation that the prediction horizon undergoes. The distribution chosen will guarantee the requirements of driving on any road type with lateral errors less than 30 cm, considering a good value for driving on any road type.

Figure 4 depicts the tested distributions. The uniform distribution is characterized by having equidistant tentatives at 25/4m. The non-uniform distribution considers the first sample at a distance of 25/8m, the second at 25/4m, the third at 25/2m, and the fourth at 25m from the ego vehicle. The uniform distribution shows greater lateral error when the curvature radius of the circulated path decreases, a situation given in curves or roundabouts. The non-uniform distribution reduces errors when driving around curves and roundabouts. The first distance interval is represented in green, and the time increment is fixed for both cases to 0.02s.

Due to security concerns, the performance of the controller must be assessed firstly on a simulation environment, where a sudden unstability generated by the spatial distribution of the tentatives of the MPC controller might not pose any risk. Once validated in the simulator, any distribution can be recreated with the prototype platform, and therefore undergo a process in which sim-to-real transfer viability and reality gap can be evaluated.

3.5 Reality gap analysis

Variables related to the controller’s performance and comfort are chosen as signals to evaluate the reality gap between simulation and real scenarios, since this is the behaviour pattern any simulation must aim to replicate with regards to its real counterpart. The similarity between two continuous signals will be the indicator used to evaluate the reality gap. The literature shows a high number of approaches based on correlation measurement to assess the similarity between two continuous signals. Some methods are the Pearson correlation coefficient (PCC), time-lagged cross-correlation (TLCC), windowed TLCC, dynamic time warping, instantaneous phase synchrony, or max normalized cross-correlation (MNCC). This work uses the PCC and the MNCC for reality gap assessment.

The PCC indicator represents the covariance of two continuous signals over time and denotes the linear relationship between 0 (not correlated, high reality gap) to 1 (perfectly correlated, low reality gap). Two essential things were considered in this indicator: (1) outliers can skew the results of the correlation estimation, and (2) the signals are homoscedastic such that the variance is homogeneous across the data range. The PCC is defined as

$$ \rho(X, Y) = \frac{\mathrm{E}[(X - \mu_{x})(Y - \mu_{y})]}{\sigma_{x} \sigma_{y}}, $$
(6)

where X and Y are two continuous temporal signals, μx and σx are the mean and standard deviation of signal X, and μy and σy are the mean and standard deviation of signal Y.

The MNCC measures how similar two continuous signals are. The result ranges between 0 (no similarity) and 1 (perfect similarity). The MNCC is defined as

$$ \beta(X, Y) = \frac{X * Y}{max((X * X), (Y * Y))}, $$
(7)

where (XX) and (YY ) are the autocorrelation of the X and Y signals, (XY ) is the cross correlation.

Section 4.1 presents the MPC performance in both settings and compare them with previous work [24]. Finally, Section 4.2 analyses comfort in both cases.

4 Results

This section presents the results and comments from the test of the MPC controller and the reality gap assessment, both in real world and simulation environments. Figure 5 illustrates the real-world and simulated circuit. The test environment includes a wide variety of driving elements, such as six straight sections, four curves with a radius of curvature of fewer than ten meter, and three roundabouts. The driving elements are labeled from 01 to 06, and Table 2 links the Id with a distance range for each Id. In total, the length of the circuit is approximately 3 km. The maximum and minimum traffic speeds are 40km h− 1 and 20km h− 1. The scenario is sufficiently representative with a variability of structural traffic elements, which adds robustness to the obtained results. Further, the proposed controller is compared with an existing MPC controller, developed for path tracking, based on ILQR as found in previous work [24]. Since the ILQR based system does not fulfill any safety criteria, the comparison was carried out only in the simulation suite.

Fig. 5
figure 5

Real and simulated path

Table 2 Driving elements and associated distance ranges

4.1 Reality gap assessment in MPC performances

In order to evaluate the reality gap, representative variables need to be selected to compare the performance of both the simulated and real world systems. Thus, from an experimental point of view it was decided to select the lateral error and the control action, namely the tire angle. Figures 6 and 7 depict the evolution of the lateral error and tire angle for four possible scenarios, two in simulation and two in real-world conditions, according to the two setups described previously regarding spatial tentative distribution. The blue and green signals represent the system’s behaviour when the spatial distribution of tentatives are uniform. In contrast, the orange and red signals represent the system’s behaviour when the spatial distribution of tentatives is non-uniform. In both cases, there is a notable similarity in the shape of the signals. A structural road image has been overlapped to depict the driven scenario.

Fig. 6
figure 6

Lateral error from real-world and simulated tests

Fig. 7
figure 7

Tire angle from real-world and simulated test

Table 3 presents the root mean square lateral errors for segments where there is a curve or roundabout in the circuit; distance intervals are detailed on Table 2. The maximum value of mean square lateral error is 0.42m, evaluated on curve 04 (110 with 5m curvature radius) when the setup is uniform in real-world conditions. In the same curve, the lateral error is reduced to 0.14m when a non-uniform distribution in real-world conditions is evaluated. Therefore, a non-uniform distribution reduces the lateral error up to 3 times. The behaviour in simulation is similar; the maximum lateral error is 0.32m and 0.09m on curve 04 when the setup is uniform and non-uniform tentatives distribution, respectively, reaching an improvement of 3.5 times.

Table 3 Lateral error and reality gap associated

With regards to reality gap, uniform tests show more similaritiy than nob-uniform ones, for both indicators. Thus the sim-to-real transfer is better when a non-uniform distribution is used. More than 91% and 71% of similarity is measured with PCC and MNCC on uniform tests, respectively. The mean percentage of similarity decreases with non-uniform tentatives in both indicators to 70% and 65%. So, both indicators depict a similarity decrement when a non-uniform tentative distribution setup is compared against uniform distributions. When dynamics are more demanding, the bycicle kinematic model and some CARLA features yield a poor representation of reality. Applying new sim-to-real transfer methods will be a solution to reduce the reality gap in this setup.

Table 4 presents the root mean square tire angle for segments where there is a curve or roundabout in the circuit; distance intervals are detailed on Table 2. The maximum tire angle value is 3:6, evaluated on curve 04 (110 with 5m 655 curvature radius) when the setup is uniform in real-world conditions. In the same curve, the maximum tire angle value is reduced to 3:5 when a non-uniform distribution in real-world conditions is evaluated, with a difference of 0:1. While the lateral error is reduced three times, the maximum tire angle is very similar in both cases, which prompts the conclusion that spatial distribution has little effect in the effort the controller needs to enact on the actuators. The analogy with a PID controller is an increment the derivative constant, reaching a faster PID behaviour. However, the maximum tire angle difference on simulation test increases to 0:4 on curve 04, where maximum tire angles are 5:9 and 6:3 for uniform and non-uniform distribution, respectively.

Table 4 Tire angle and reality gap associated

As in the previous case, the reality gap is studied with PCC and MNCC analysis. The uniform and non-uniform comparatives depict 99% with PCC, regardless of tentative configuration. However, signal correlation analysis shows a mean of 85% for both tentative formats. So, vehicle control actions on real and simulation tests are equal, as evidenced by both indicators. These good reality gap results (99% and 85%) prove that the steering control system tested in simulation and real world environments is the same.

To further validate the behaviour of our MPC controller, a comparison against the MPC steering controller explained on [24] is performed. This controller arises as an extension of the well-known linear quadratic regulator (LQR), which can solve both non-linear and constrained problems by linearising and incorporating restrictions in the performance index. The algorithm has a mean computation time of 2 s, not enabling the real-time feature. On the other side, Fig. 7 depicts a non-smooth tire angle dynamic, which yields an uncomfortable and unsafe experience.

4.2 Reality gap assessment in comfort indicators

Although indicators related to lateral error and tire angle were analyzed, these do not evaluate whether the driving actions are comfortable for vehicle occupants. Driving comfort analysis is carried out based on previous work [25], which describes a modeling occupant’s preference metric for setting a maximum and minimum value of acceleration and jerk along the transversal axis of the vehicle that ensures a comfortable driving experience. Figure 8 depicts the signal jerk assessment on real and simulation tests. The maximum values of acceleration and jerk across the vehicle are -2m s− 2 and 0.75m s− 3, that are within a comfortable driving margin of ± 4ms s− 2 and 0.9m s− 3 as described in [25]. These variables depend on linear speed and curves/roundabout topology. A reference velocity of 20kmh− 1 was used throughout the circuit, except in curve 04 (110 with 5m curvature radius) where it is reduced to 5km h− 1 The lateral acceleration is not shown due to the smooth results recorded, which reveal no additional information.

Fig. 8
figure 8

Lateral jerk behaviour. a Blue line: Uniform Test on Simulation. b Orange line: No Uniform Test on Simulation. c Green line: Uniform Test on Real. d Red line: No Uniform Test on Real. e Dash line: ILQR on Simulation. f Yellow line: Comfortable Jerk limits

As in the previous case, the reality gap is studied with PCC and MNCC analysis, separating uniform and non-uniform tests. Table 5 shows similarity percentage values for each segment on the test; the last row depicts the mean column value for every test. The Pearson indicator shows a mean value higher than 89% in all tests. On the other hand, MNCC analysis shows 80% of similitude on lateral acceleration for uniform and non-uniform configuration. Furthermore, 60% of similitude on jerk across vehicle with a uniform distribution is reached, except for in roundabout 02 where the similitude value is 29%, with a value of 0.6m s− 3. Finally, 68% of similitude on jerk across vehicle with a non-uniform tentatives distribution is reached, but with a similar exception as in the previous case with worse results from roundabout 02 where similitude value is 40% due to a signal amplitude difference. This results show that the behaviour of both the real and virtual MPC controllers differs noticeably in regards to comfort and safety, which is further reinforced by the results shown in Table 3.

Table 5 Acceleration and jerk across vehicle reality gap

The controller derived in [24] does not fulfill transversal jerk comfort constraints with maximum values of 2:5ms− 3 as shown in Fig. 8.

Finally, a robustness and repeatability analysis of the MPC controller is depicted in the Figs. 9 and 10, showing the difference in the behaviour of the tire angle and the lateral error when the test circuit is run five times in a simulation environment. The comparisons are made concerning the first test carried out. The results achieved are an average over the distance travelled with a value of 0.2o to the tire angle and 1cm to the lateral error saved by the controller. We believe that the results demonstrate the repeatability of the controller’s behaviour and its robustness.

Fig. 9
figure 9

Tire angle comparative; five laps run

Fig. 10
figure 10

Lateral error comparative; five laps run

5 Conclusions and future work

This paper adresses a novel measurement technique for sim-to-real transferability in the realm of autonomous driving. A statistic analysis of acquired data in both real-world and simulation environments has been showcased in order to improve the validation stage of the design process of a steering controller, which in our case has been exemplified with the use of an MPC system based on adaptive potential fields. The proposed framework presents a procedure in which both data streams are studied to provide a degree of correlation between the behaviour of the system in a simulated environment and its real counterpart, so that the designer can have an accurate expectation of how the real system will perform after having been designed in a full-software workflow.

The statistical concepts used to evaluate the reality gap are the Pearson correlation coefficient (PCC) and max normalized cross-correlation (MNCC). These metrics are applied over ego-vehicle behaviour (steering, lateral error) and comfort signals (lateral acceleration, jerk). The behaviour of the controller is evaluated on lateral error and tire angle, and comfort is evaluated using lateral acceleration and jerk. Our studies rendered a PCC of 99% and a MNCC greater than 85% between both tire angle signals, and a PCC of 85% and MNCC of 93% when analysing lateral jerk. The similarity results evidence that simulation approaches, such as a complete design procedure within a simulation suite such as CARLA are beneficial to designing trajectory tracking systems.

Another contribution of this work is the study of control tentative distribution over the prediction horizon of the proposed MPC controller. Our tests show that, while both configurations render a capable and safe controller, non-uniform distributions reduce lateral error up to 32%, achieving a maximum lateral error of 60cm in the most challenging point of the test track. However, correlation results on both PCC and MNCC are suboptimal, which prompts a future line of work that derives a specific set of tests and metrics to assess a method of correction within the simulation environment to improve the design process when applying a non-uniform spatial distribution.

Comfort analysis is crucial in a simulation design process when a path tracking system is designed. Hence, the paper identifies limits on lateral acceleration and jerk values to assess the performance of the proposed controller in what relates to percieved comfort of the occupants of the vehicle. As demonstrated by the extensive analysis of results present in previous sections, our controller performs adequately and stays below the range that has been identified as comfortable via real-world testing. Besides, the correlation results pertaining both signals involved further reinforce the benefits of a full-software desing. On this note, a comparison between the proposed MPC controller and a ILQR based one has been carried out. The results of this comparison seem to indicate that our implementation has considerable benefits over an optimal control derived one, especially regarding passenger comfort and safety.

A clear benefit of taking into account the reality gap into the desing process of the MPC controller arises from the possibility of extending its capabilities beyond trajectory tracking with the inclusion of other autonomous driving subsystems. One of our future lines of work will reside in the design of a path planning system that responds to traffic issues such as obstacle avoidance and overtaking maneuvers. Other future improvements to the system include the implementation of a more complex movement model, which would further reduce sim-to-real transferability, and an enhancement of the software implementation, which would lower computational demand and therefore allow to extend the number of tentatives the MPC controller can consider or the number and nature of the constraints included in its optimization stage, which would render a safer, more comfortable experience. In this regard, the ILQR implementation described previously, which we compared our proposed controller against, includes a novel implementation using auto-gradients and both backward and forward differentiability which makes for computation periods in the microseconds range, and therefore allows to substantially increase the number of tentatives considered. A similar implementation of the optimization procedure on our MPC controller has been identified as a future line of work.

If a sufficiently powerful reality gap analysis is performed, one of its main consequences resides in the design of neural steering controllers in a full-software setup, based on the beahviour of an MPC controller that has been sufficiently adapted to replicate the performance of a real implementation. With this controller, a wide array of driving situations can be designed and tested in order to build a database with which to train any sort of neural control structure that would further improve the real-time capabilities of the system.