Keywords

1 Introduction

Any robotic application like Autonomous Driving requires knowledge of its location, i.e. position, orientation, velocity, etc. to provide autonomous functionality. The measurements obtained from different sensors cannot be directly interpreted as the vehicle state due to various challenges including, noisy measurements, asynchronous sensor readings, discrepancies in the sensor measurement format compared to state representation, etc. Hence, the state is estimated using the measurements obtained from different sensors. In literature, these state estimation algorithms can be broadly categorized into filtering-based built upon the Kalman Filter and its different variants: Extended Kalman Filter (EKF) [2, 8], Unscented Kalman Filter [1, 8], and Graph-based methods either using a pose graph [12] or a factor graph [4, 5]. While Filtering-based approaches are primarily employed for state estimation, graph-based approaches are the primary choice for Simultaneous Localization and Mapping (SLAM). With the advancement of learning-based algorithms, some works in literature use deep learning techniques to perform state estimation. [9,10,11, 14]. Even though deep learning-based algorithms provide reliable solutions, they are often constrained to the systems they are trained for due to measurable domain gaps and require the ground truth data to optimize the algorithm.

The representation of the Vehicle state varies across different works, with each adopting distinct formats and modeling approaches for their components. Among these components, side slip angle estimation proves to be one of the most challenging, particularly within the context of modeling vehicle lateral dynamics. The absence of a dedicated sensor for direct side slip angle measurement has prompted the exploration of various techniques in the literature, often involving GNSS and optical sensors. However, the reliability of GNSS sensors in urban canyons and the cost associated with optical sensors present significant challenges. The UKF filtering approach in [1] models the vehicle motion using the single-track vehicle kinematic model and doesn’t explicitly compute the side slip angle. [2] address this by including vehicle side slip angle in state representation and modeling vehicle motion with a single-track version of the four-wheel vehicle model, however, the filtering-based approach is not modular and hence not designed to include additional sensors into the framework. The authors in [3] propose a state estimator based on a factor graph. This estimator models the vehicle’s lateral dynamics using a single-track vehicle dynamics model. Nevertheless, it is noteworthy that this research is confined to representing only the vehicle’s lateral dynamics. Unfortunately, this limitation restricts the incorporation of additional sensors such as the Inertial Measurement Unit (IMU), GNSS, LiDAR, Cameras, and others into the broader state estimation framework. Our earlier work, [4] is similar to the proposed architecture, but only tests the inclusion of the vehicle kinematic factor into the factor graph without explicitly modeling vehicle lateral dynamics.

In this work, we propose a factor graph-based state estimation architecture that takes the single-track vehicle dynamics model as the reference factor and integrates GNSS measurements and LiDAR odometry for vehicle state estimation. We demonstrate the superiority of the proposed algorithm in terms of modeling vehicle lateral dynamics compared to the baselines and its modularity in integrating multiple asynchronous external sensors enabling robust state estimation in GNSS outage scenarios. The major contribution of the proposed framework is threefold:

  • We propose a novel and modular factor graph-based state estimation framework that can utilize measurements from IMU, LiDAR, GNSS, and In-Vehicle sensors (i.e. velocity and steering encoders) to provide consistent and reliable state estimation.

  • Filter-like implementation of the architecture enables real-time state estimation at high frequency. The frequency can be regulated to meet user’s need.

  • We demonstrate that the proposed algorithm can provide robust and accurate lateral dynamics state estimate when compared to the IMU preintegration based factor graph [4] and baseline EKF [2].

Fig. 1.
figure 1

Left: Overall Architecture of the proposed factor graph based state estimator illustrating inputs from multiple sensors, Right: Vehicle Sensor setup and Reference frames definition for problem formulation

2 Problem Formulation

For state representation, we take the position of the IMU as the reference point and assign it as the origin of the Vehicle Reference Frame (V). \({X}_t\) represents the state or values for the factor graph,

$$\begin{aligned} {X}_t = \begin{bmatrix} _Ix^G, _Iy^G, _I\psi ^G, V_x^I, \beta ^I,\omega ^I \end{bmatrix}_t \end{aligned}$$
(1)

where, [\( _Ix^G\), \( _Iy^G\)] and \( _I\psi ^G\) are the IMU’s position and orientation in the Global Reference Frame (G). \(V_x^I, \beta ^I\), and \(\omega ^I\) are longitudinal velocity, side slip angle and yaw rate in IMU frame. The subindex t corresponds to the time instance.

If we represent \({\textbf {X}}^G_t\) as a set of the factor graph state till time instance t, the algorithm needs to estimate the posterior distribution \(p({\textbf {X}}^G_t|{\textbf {Z}}_t)\). Where \({\textbf {Z}}_t\) represents all the measurements from sensors till time instance t. Estimation can be performed using the Maximum A Posteriori (MAP) inference, for \({\textbf {X}}^G_t\), which is represented as:

$$\begin{aligned} {\textbf {X}}^G_t = \underset{{\textbf {X}}^G_t}{argmax }(p({\textbf {X}}^G_t|{\textbf {Z}}_t)) \end{aligned}$$
(2)

while we can approximate the posterior \(p({\textbf {X}}^G_t|{\textbf {Z}}_t)\) using the Baye’s law as,

$$\begin{aligned} p({\textbf {X}}^G_t|{\textbf {Z}}_t) \propto p({\textbf {Z}}_t|{\textbf {X}}_t^G) p({\textbf {X}}_0) \end{aligned}$$
(3)

where, \(p({\textbf {Z}}_t|{\textbf {X}}_t^G)\) is the measurement likelihood density and \(p({\textbf {X}}_0)\) is the prior of the states. Given the measurements obtained from the LiDAR in the point cloud format, linear acceleration, and angular velocities measurements from IMU, global position measurements in the latitude, longitude, and altitude format from the GNSS sensor, steering angle measurement from the steering encoder and longitudinal velocity measurement from the velocity encoder, the algorithm provides the estimates of \({\textbf {X}}_t^G\) at time instance t by solving Eq. (3) for the respective states.

The overall architecture of the algorithm is illustrated in Fig. 1 and detailed representation is provided in Fig. 2. The factor graph is designed to accommodate the vehicle dynamics factor as a base factor and add additional sensor measurements as per their availability.

2.1 Factor Graph

A factor graph is constructed by taking the synchronized In-Vehicle sensor’s measurements as the reference points in a temporal sense. These sensors have the highest frequency in most sensor setups and hence can provide connection points to other sensors with a high degree of flexibility. Using the Robot Operating System (ROS) framework, the factor graph is constructed by taking their timestamps as the reference points, and new values are added to the graph concurrently. The modular nature of the graph enables the addition and removal of sensors based on their availability. The sensor setup consists of 2 GNSS Sensors, located at the front and the rear of the vehicle operating at 10 Hz, one Hesai 32 plane LiDAR operating at 20 Hz, one Bosch 5-axis IMU, and In-Vehicle sensors, all operating at 100 Hz.

Filter Like Architecture. The posterior distribution in Eq. 3 is proportional to the product of a prior distribution \(p({\textbf {X}}_0)\) and the measurement likelihood \(p({\textbf {Z}}_t|{\textbf {X}}_t^G)\). This allows for an iterative solution to the MAP problem by moving the optimization horizon into a computationally tractable threshold [5]. The MAP inference referred to by Eq. 2 is solved by using factor graph [13] in this work. A factor graph is a bipartite graph, \(\mathcal {G} = (\mathcal {V},\mathcal {F})\) with \(\mathcal {V}\) being the set of values, \(\mathcal {F}\) being the set of factors. For a given factor graph, MAP inference in Eq. 2 can be rewritten as a product of multiple factors:

$$\begin{aligned} {\textbf {X}} = \underset{{\textbf {X}}}{argmax }\prod _i \psi _i(X^i) \end{aligned}$$
(4)

where the subscript i refers to the factor number. Each factor can be a result of measurement or constraint received from different sensors. Assuming that each factor is of a type,

$$\begin{aligned} \psi _i(X^i) \propto exp\{ -\frac{1}{2}||h^i(X^i)-z^i||_{\varSigma _i}^2 \} \end{aligned}$$
(5)

measurements from different sensors can be easily integrated into the factor graph if the measurement model given by \(h^i(X^i)\) can be defined. The MAP inference problem can be modeled as a least square optimization problem with the residual error given by \(e^i = h^i(X^i)-z^i\). \(z^i\) is a measurement received from a sensor. The measurements are assumed to be normally distributed with zero mean and \(\varSigma _i\) covariance. This factorized implementation facilitates the modular architecture of the factor graph-based algorithm and hence enables sensor fusion for measurements coming from different sensors with different modalities.

Fig. 2.
figure 2

Factor Graph construction displaying all the factors

In this work, we implement a filter-like factor graph-based state estimator taking inspiration from the work in [5]. Given that In-Vehicle sensors are the sensor with the highest frequency in most vehicles, its timestamp is used to add values to the factor graph. Having established a factor graph with reference values at high frequency, a variety of sensors can be easily integrated. Most control algorithms require the state to be estimated at a high frequency for smooth operation, however, LiDAR odometry and GNSS measurements are usually available at a lower frequency of (10–20) Hz. This filter-like implementation is designed using predict-optimize iterations to provide state estimates at In-Vehicle sensors’ frequency. With each In-Vehicle sensor measurement, the state is predicted to the next time instance using single-track vehicle dynamics model, and once a LiDAR or GNSS factor is added to the graph, it is optimized to obtain an updated or optimized state.

2.2 Vehicle Dynamics Factor

To construct the factor graph within the framework, the vehicle dynamics are incorporated into residual error functions aimed at minimizing through factor graph optimization. Equation 6 represents the residual errors pertaining to vehicle dynamics and connects two states or values, denoted as \(X^k\) and \(X^{k-1}\) within the graph. Here, \(v_y\), \(v_x\), and \(\omega _z\) are the lateral, longitudinal, and angular velocities of the vehicle, respectively. Additionally, \(C_f\) and \(C_r\) denote the cornering stiffness of the front and rear tires, while \(l_f\) and \(l_r\) represent the distances from the vehicle’s center of gravity to the front and rear axles, respectively. The steering angle \(\delta \) and longitudinal acceleration \(a_x\) are measured from the steering encoder and IMU. A single dynamic factor is established by linearizing Eq. 6. In the factor graph, values are added at the frequency of the In-Vehicle sensors, and two consecutive values are linked by factors, as depicted in Eq. 6.

$$\begin{aligned} \begin{aligned} e_{x}^{k-1} &= x^k - x^{k-1} - \varDelta t(v_x^{k-1}(\cos (\psi ^{k-1}) - \sin (\psi ^{k-1})\beta ^{k-1})) \\ e_{y}^{k-1} &= y^k - y^{k-1} - \varDelta t(v_x^{k-1}(\sin (\psi ^{k-1}) + \cos (\psi ^{k-1})\beta ^{k-1})) \\ e_{\psi }^{k-1} &= \psi ^k - \psi ^{k-1} - \varDelta t \omega _z^{k-1} \\ e_{v_x}^{k-1} &= v_x^k - v_x^{k-1} - \varDelta t a_x^{k-1} \\ e_{\beta }^{k-1} &= \beta ^k - \beta ^{k-1} + \varDelta t \left( -\frac{C_f + C_r}{m v_x^{k-1}}\beta ^{k-1} - \left( \frac{C_f l_f - C_r l_r}{m(v_x^{k-1})^2} + 1\right) \omega ^{k-1} + \frac{C_f \delta ^{k-1}}{m v_x^{k-1}}\right) \\ e_{\omega _z}^{k-1} &= \omega _z^{k} - \omega _z^{k-1} - \varDelta t \left( -\frac{C_f l_f - C_r l_r}{J_z}\beta ^{k-1} - \frac{C_f l_f^2 + C_r l_r^2}{J_zv_x^{k-1}} + \frac{C_f l_f \delta ^{k-1}}{J_z}\right) \\ \end{aligned} \end{aligned}$$
(6)

2.3 Measurements Factor

The measurement factor incorporates lateral acceleration \(\hat{a}_y^k\) and yaw rate \(\hat{\omega _z}\) measurements from IMU. It also includes longitudinal velocity measurement, \(\hat{v_x}\) from the velocity encoder. This integration is expressed through the residual of these measurements connected with the value at time instance k, denoted as \(X^k\), as illustrated in Eq. 7. Additionally, the steering angle at time instance k, \(\delta ^k\) is also considered in the formulation.

$$\begin{aligned} \begin{aligned} e_{\hat{v_x}}^k = \hat{v_x} - v_x^k \\ e_{\hat{\omega _z}}^k = \hat{\omega _z} - \omega _z^k \\ e_{a_y}^k = \hat{a}_y^k + \frac{C_f+C_r}{m}\beta ^k + \frac{C_fl_f - C_rl_r}{m v_x^{k}}\omega _z^k - \frac{C_f\delta ^k}{m} \end{aligned} \end{aligned}$$
(7)

2.4 GNSS and LiDAR Factors

GNSS sensor provides the location of the vehicle in the frame G. The sensor setup has two RTK-corrected GNSS sensors located at the front and rear of the vehicle and hence can also provide information on the vehicle’s global yaw and pitch angle. The GNSS measurements are transformed into the IMU reference point of the vehicle and are subsequently added to the factor graph as a Unary factor using the known static transformation between the GNSS sensor and the IMU sensor.

$$\begin{aligned} \begin{aligned} ^{I}\hat{p}^{G} = \hat{p}^{G}_t + _{GN}R^I _{GN}p^I \end{aligned} \end{aligned}$$
(8)

where \( ^{I}\hat{p}^{G}\) is the GNSS measurement transformed to the IMU location in the vehicle expressed in the G frame. \( _{GN}R^I\) and \( _{GN}p^I\) represent the rotation and translation component of the static transformation between IMU and the GNSS sensor.

To compute LiDAR Odometry we employ a lightly coupled odometry computer based on the work of [6]. In this approach, the odometry computer directly subscribes to the point cloud topics obtained from the LiDAR sensor and is used as a binary factor between two values corresponding to the timestamps of the consecutive LiDAR scan. The relative transformation between the two consecutive point clouds is used as the factor between the values in the graph closest to these timestamps.

3 Implementation

We design the factor graph and perform state estimation using the Fixed Lag Smoother in the GTSAM [7] framework. We use the ROS framework to establish callbacks to acquire sensor measurements and test real-time applicability. The proposed algorithm is validated against two model-based state estimators: an EKF [2] and an IMU preintegration-based factor graph [4]. We acquire test data with difficult maneuvers to demonstrate the algorithm’s ability to model vehicle lateral dynamics and its modularity to integrate arbitrary other sensors in the state estimation framework. We use a Datron optical sensor to measure the vehicle’s longitudinal and lateral velocities and validate the estimated state against it in complex maneuvers.

4 Results

A series of slalom maneuvers is performed to create the first scene. The trajectory of the vehicle’s motion along with the states estimated by the proposed algorithm and the baselines are illustrated by Fig. 3. Position estimation from the proposed algorithm and the baselines are very similar as they both use the RTK-corrected GNSS sensors, however, a significant improvement can be seen in the estimation of the side slip angle. We can observe that the side slip angle estimation from the proposed factor graph that utilizes a vehicle dynamics model for motion prediction matches closely to the estimation from the EKF and the Datron side slip computation, while estimation from the IMU factor graph is significantly worse in maneuver end situations.

Fig. 3.
figure 3

Top Figure: Ground truth and estimated vehicle trajectories using the proposed and the baselines algorithms. The green circle indicates the start point while the arrow shows the vehicle motion direction. Bottom Figure: Ground truth and estimated side slip angles using the proposed and baseline algorithms

Table 1. Root Mean Square Error (RMSE) values for estimated states (Position, Yaw angle values) computed using the baselines EKF [2], IMU Graph [4] and the proposed algoritm
Fig. 4.
figure 4

Top Figure: Ground truth and estimated vehicle trajectories expressed in GRF. The green and red circles indicate the start point and GNSS failure point while the arrow shows the vehicle motion direction. Bottom Figure: Ground truth and estimated side slip angle in radians

In the second scenario, we use two scenes as shown in Fig. 4, to validate the algorithm, where we investigate a GNSS failure case. We eliminate the availability of both GNSS measurements to the algorithm after a certain time instance from the algorithm’s initiation and analyze its performance in terms of state estimation. This same procedure is applied to the baseline algorithms. Note, for the baseline EKF, the IMU and In-Vehicle sensors’ measurements are made consistently available and update respective states even in the absence of GNSS measurements in the failure scenario. Ground truth for position and yaw angle is computed using two RTK-corrected GNSS positions. Notably, since GNSS positions are unavailable from 5 s onward for Scene 1 and 50 s onward for Scene 2 into the algorithm run, the unused GNSS measurement serves as a reliable ground truth reference. This further emphasizes the robust performance of the proposed dynamics graph under GNSS failure conditions. The qualitative analysis presented in Fig. 4a reveals that the proposed algorithm consistently outperforms both the EKF [2] and the IMU factor graph [4] in terms of vehicle position estimation. Moreover, the estimation of the side slip angle from the proposed graph appears to be more accurate compared to the IMU graph and EKF. A similar level of performance is observed in the GNSS failure scenario in Scene 2, illustrated by Fig. 4b. The quantitative results are depicted in Table 1, supporting the findings and show that the proposed dynamics graph consistently outperforms both baseline algorithms in positional and yaw angle estimation. In total, we can observe up to \(60\%\) and \(23\%\) performance improvement on average, in terms of positional and yaw angle estimation respectively compared to the IMU graph.

5 Conclusion

We present a novel factor graph-based state estimation algorithm. The algorithm uses a vehicle dynamic factors to establish a base factor graph and integrates multiple asynchronous sensors like LiDAR, GNSS, etc. to provide consistent and accurate state estimation at high frequency. We validate the algorithm using data collected by an in-house vehicle with the defined sensor setup and multiple scenarios, where we simulate the GNSS sensor failure to check the reliability of the algorithm in urban settings. We were able to demonstrate that the proposed algorithm is highly modular and can seamlessly integrate multiple asynchronous sensors while providing reliable and robust state estimates compared to the baselines.