1 Introduction

The history of odometry in robotics has seen a significant evolution, marked by key milestones and influential literature [63, 96, 159]. In the early stages, odometry relied heavily on wheel encoders and dead reckoning methods [28]. However, the accuracy of wheel odometry was constrained by sensor errors stemming from wheel slippage and by algorithmic inaccuracies. During this phase, researchers explored alternative approaches, shifting their focus to other sensors, such as range sensors and visual sensors. Concurrently, the field of computer vision surged, with rapid developments in visual odometry studies [38, 111, 121]. Simultaneously, studies emerged that obtained odometry from range sensors [92, 93], alongside advances in scan registration algorithms such as Iterative Closest Point (ICP) [9]. These two major research streams, range sensor-based odometry and visual odometry, represent a critical juncture in the historical evolution of robotic odometry.

Fig. 1 Structure of the LiDAR odometry survey. Section 2 explores the intricacies of LiDAR technology. Sections 3, 4, 5 and 6 investigate LiDAR odometry under different sensor modalities. Section 7 introduces the ongoing challenges in LiDAR odometry. Finally, Sect. 8 presents the public datasets, evaluation metrics, and benchmark results

Further into this period, range sensors advanced, and 3D LiDAR emerged as a transformative technology capable of measuring the surrounding space in 3D, surpassing traditional 2D measurements. Despite substantial progress, visual odometry faces limitations in low-light conditions, restricting its applicability in scenarios such as nighttime operations. Recognizing the importance of precise location data for autonomous robots in decision-making [58, 60, 97, 118, 119, 147], researchers turned their attention to LiDAR, which scans the surroundings in 3D while remaining unaffected by lighting conditions. This led to a rapid evolution in range sensor-based odometry using LiDAR [125, 126, 162], prompting a focused review of odometry works leveraging LiDAR.

In previous research, Mohamed et al. [96] extensively reviewed odometry approaches, placing a particular emphasis on visual-based methods. Conversely, Jeon et al. [59] presented a survey specifically tailored to unmanned aerial vehicles (UAVs), focusing on the performance of visual odometry algorithms when implemented on NVIDIA Jetson platforms. Their assessment considered factors such as odometry accuracy and resource utilization (CPU and memory usage) across different Jetson boards and trajectory scenarios. Wang and Menenti [143] summarized the major applications of odometry, pointing out an expected shift toward addressing challenges in the field. Meanwhile, Li and Ibanez-Guzman [82] provided a detailed review of automotive LiDAR technologies and associated perception algorithms, exploring various components, advantages, challenges, and emerging trends in LiDAR perception systems for autonomous vehicles. Focusing on LiDAR-only odometry, Jonnavithula et al. [63] categorized existing works into point correspondence, distribution correspondence, and network correspondence-based methodologies. They also conducted performance evaluations of the LiDAR-only odometry literature. Similarly, Zou et al. [180] performed a comprehensive analysis and comparison of LiDAR simultaneous localization and mapping (SLAM) for indoor navigation, detailing strengths and weaknesses across real-world environments.

Notably, our review addresses a gap observed in existing surveys. While previous works have delved into specific aspects of LiDAR odometry, none have completely covered all methodologies. Therefore, our review aims to provide a thorough examination, encompassing not only LiDAR-only odometry but also approaches that successfully integrate other sensors for accurate LiDAR odometry.

The structure of this survey, illustrated in Fig. 1, unfolds as follows: Sect. 2 initiates an exploration of LiDAR sensors. Subsequently, we categorize LiDAR odometry based on sensor modality and delve into each category within respective sections. Section 3 is dedicated to methods that rely solely on LiDAR, while Sect. 4 outlines LiDAR odometry works that integrate an IMU with LiDAR. Section 5 provides insights into odometry employing multiple LiDARs. In Sect. 6, we examine the fusion of LiDAR with other sensors, such as cameras. Following this, we delve into the unresolved challenges within LiDAR odometry. Finally, our survey concludes by discussing available public datasets and evaluation metrics, supplemented by the presentation of benchmark results. The key contributions of this paper are as follows:

  • Our paper offers a comprehensive review of LiDAR odometry following the progression of the technology. We categorize the review into the following sections: LiDAR preliminary, LiDAR-only odometry, LiDAR-inertial odometry, multiple LiDARs, and fusion with other sensors.

  • Our paper explores unresolved challenges in LiDAR odometry, offering insights and directions for future research. By addressing these challenges, we aim to catalyze advancements that enhance the accuracy and robustness of LiDAR odometry.

  • Our paper scrutinizes existing public datasets, highlighting their distinctive characteristics. Furthermore, we provide an overview of the evaluation metrics utilized in relevant studies and present benchmark results.

2 LiDAR preliminary

To understand the progress and challenges in LiDAR odometry, it is essential first to grasp the basics of LiDAR sensors. This section investigates the fundamental principles and different categories of LiDAR sensors.

2.1 Light detection and ranging

LiDAR, an acronym for Light Detection And Ranging, is a powerful remote sensing technology employed for measuring distances and constructing highly detailed 3D representations of objects and environments [67, 117, 146]. The sensing process commences with a LiDAR system emitting laser pulses toward a designated area. When these pulses encounter obstacles, a portion of the light reflects back to the LiDAR sensor. Measuring the time each laser pulse takes to return and leveraging the constant speed of light, LiDAR calculates the distance to the target.
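Concretely, with the speed of light \(c\) and a measured round-trip time \(\Delta t\), the range follows as \(d = c\,\Delta t / 2\). For example, a return delay of roughly 667 ns corresponds to a target about 100 m away.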

Applied systematically across large areas and synthesized into distance measurements, LiDAR produces a point cloud—a collection of numerous points in 3D space. These points effectively map the 3D shape and features of the area or object. In essence, LiDAR facilitates the creation of highly detailed and accurate 3D representations of the surrounding world, proving invaluable in various fields such as geospatial mapping [18, 37], autonomous navigation [63, 180], and environmental monitoring [145, 169].

2.2 LiDAR categorization

LiDARs can be categorized based on their distinct imaging architectures and measurement principles, as extensively discussed in a previous survey [117]. The imaging mechanisms of LiDAR can be classified into three main categories: mechanical LiDARs, scanning solid-state LiDARs, and flash LiDARs with non-scanning architectures. Regarding measurement principles, the primary types comprise pulsed Time of Flight (ToF), Amplitude Modulated Continuous Wave (AMCW), and Frequency Modulated Continuous Wave (FMCW) LiDARs. Additionally, LiDARs can be further sub-classified based on attributes such as detection range, field of view (FOV), and wavelength, as discussed in other literature [8, 73]. However, in this paper, we concentrate on mechanical LiDARs, scanning solid-state LiDARs, ToF LiDARs, and FMCW LiDARs, as these variants hold significant relevance in the context of LiDAR odometry.

2.2.1 Imaging mechanisms

Mechanical LiDARs, one of the most established configurations, operate using a rotating assembly to direct a laser beam across different angles. While mechanical LiDAR has proven reliable in measurement quality, it is subject to limitations associated with its mechanical components. These include susceptibility to degradation over time, necessitating regular maintenance to ensure optimal functionality. The inherent moving parts can also result in slower data acquisition speeds and increased vulnerability to vibrations and external shocks.

In contrast, scanning solid-state LiDAR systems eliminate the need for mechanical rotation through diverse mechanisms. Some apply microelectromechanical systems (MEMS) mirror technology [52], which directs a stationary laser at small electromechanical mirrors whose tilt angles are adjusted by an input voltage difference, substituting for rotational components. Another solution is adopting an optical phased array (OPA) [48] system. An OPA uses phase modulators to shape the emitted wavefront, similar to a phased-array radar.

Notably, scanning solid-state LiDAR with Risley prisms [84] represents a significant innovation in the LiDAR community. Risley prisms allow rapid and controlled beam steering without physical movement, resulting in a more compact and robust system suitable for demanding applications. Despite the disadvantage of a limited FOV, this design mitigates potential issues related to component degradation and extends the LiDAR system's operational lifespan. The intricate scanning patterns also ensure exhaustive environmental mapping, a critical aspect for achieving reliable LiDAR odometry. Figure 2 illustrates the distinct scanning patterns of these LiDARs.

2.2.2 Measurement principles

ToF LiDAR operates by emitting laser pulses and measuring the time it takes for these pulses to return after bouncing off a target. The distance to the target is calculated using the speed of light and the time the laser pulse takes. This straightforward method provides high-resolution distance measurements, making it a popular choice. However, one limitation of ToF LiDAR is its susceptibility to external light sources, which can reduce the signal-to-noise ratio (SNR) [72].

On the other hand, FMCW LiDAR operates by continuously emitting light with a varying frequency and analyzing the frequency shift of the reflected light. This frequency shift is directly proportional to the target's distance, enabling precise distance measurements. FMCW LiDAR offers several notable advantages, including inherent resilience to interference due to its continuous wave signal, which helps mitigate issues caused by multi-path reflections. Moreover, FMCW LiDAR provides the relative velocity of objects by analyzing the frequency shift, which proves particularly valuable in dynamic environments. However, it is important to note that FMCW LiDAR systems tend to be more intricate and potentially pricier compared to ToF LiDARs.
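In outline (a standard textbook formulation rather than the specification of any particular sensor), a linear chirp of bandwidth \(B\) swept over period \(T\) produces a beat frequency \(f_b\) between the emitted and received light, giving the range \(d = c\,f_b\,T/(2B)\); a radially moving target additionally shifts the return by the Doppler frequency \(f_d = 2v_r/\lambda\), from which the relative velocity \(v_r\) is recovered.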

LiDAR technologies, each possessing unique strengths, play an integral role in LiDAR odometry. Tailored to diverse operational needs, they can provide a range of options for capturing accurate depth data across different applications.

Fig. 2 Diverse LiDAR scanning patterns. The figure depicts repetitive and non-repetitive scanning patterns. In (a), the Velodyne VLP-16, a mechanical LiDAR, shows a vertical channel-based repetitive pattern. In (b), the Livox Mid-70, a scanning solid-state LiDAR with Risley prisms, displays a unique lotus-shaped, non-repetitive pattern

3 LiDAR-only odometry

LiDAR-only odometry determines a robot’s position by analyzing consecutive LiDAR scans. This involves the application of scan matching, a well-known technique in computer vision, pattern recognition, and robotics. LiDAR-only odometry can be classified into three types based on how scan matching is performed: (1) direct matching, (2) feature-based matching, and (3) deep learning-based matching. A summary of the LiDAR-only odometry literature is listed in Table 1.

Table 1 The overarching summary of LiDAR-only odometry

3.1 Direct matching

The direct matching method directly calculates the transformation between two consecutive LiDAR scans, representing the most straightforward approach in LiDAR-only odometry. The ICP algorithm [9] is a commonly used technique that estimates this transformation iteratively by minimizing an error metric, typically the sum of squared distances between matched point pairs. Robot odometry is derived by computing the transformation between each pair of consecutive scans with the ICP algorithm. However, the ICP algorithm has drawbacks, including susceptibility to local minima, which necessitates a reliable initial guess. The algorithm is also sensitive to noise and outliers, such as points from dynamic objects. Additionally, its iterative nature can be computationally expensive, sometimes causing prohibitively slow computation. Consequently, substantial efforts have been dedicated to enhancing the performance of the ICP algorithm for improved odometry.
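To make the iteration concrete, below is a minimal point-to-point ICP sketch in Python. It is a simplified illustration, not any cited implementation: the function and variable names are ours, NumPy and SciPy are assumed, and practical systems add outlier rejection and a motion-prior initial guess.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iters=30, tol=1e-6):
    """Minimal point-to-point ICP: aligns source (N,3) to target (M,3)."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(target)
    prev_err = np.inf
    for _ in range(iters):
        src = source @ R.T + t                    # apply current estimate
        _, idx = tree.query(src)                  # nearest-neighbor correspondences
        matched = target[idx]
        # Closed-form SVD solution minimizing sum ||R p_i + t - q_i||^2 (Kabsch)
        mu_s, mu_t = src.mean(0), matched.mean(0)
        H = (src - mu_s).T @ (matched - mu_t)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        dR = Vt.T @ D @ U.T
        dt = mu_t - dR @ mu_s
        R, t = dR @ R, dR @ t + dt                # compose incremental update
        err = np.mean(np.sum((src @ dR.T + dt - matched) ** 2, axis=1))
        if abs(prev_err - err) < tol:             # stop when error stops improving
            break
        prev_err = err
    return R, t
```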

TrimmedICP (TrICP) [26] enhances the conventional ICP algorithm by employing the least trimmed squares method instead of the standard least squares method. This modification improves computation speed and robustness by minimizing the sum of squared residuals for a subset of points with the smallest squared residuals. Point-to-plane ICP, introduced by Chen and Medioni [25], refines the performance of the traditional point-to-point ICP by incorporating information about prevalent planes in real-world situations. Generalized-ICP [122] integrates point-to-point ICP and point-to-plane ICP within a probabilistic framework, leveraging the covariance of points during the minimization step. This approach maintains the speed and simplicity of the standard ICP while demonstrating superior robustness against noise and outliers. NICP [123] extends Generalized-ICP by evaluating distances in 6D space, including 3D point coordinates and corresponding surface normals in the measurement vector. LiTAMIN [156] and LiTAMIN2 [157] support faster registration through point reduction and modify the cost function of traditional ICP for robust registration.
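Stated compactly, these variants differ mainly in the error metric minimized over correspondences \((p_i, q_i)\): point-to-point ICP minimizes \(\sum_i \lVert Rp_i + t - q_i\rVert^2\); point-to-plane ICP [25] minimizes \(\sum_i \big(n_i^\top (Rp_i + t - q_i)\big)^2\) with target surface normals \(n_i\); and Generalized-ICP [122] minimizes the Mahalanobis form \(\sum_i d_i^\top \big(C_i^q + R\,C_i^p R^\top\big)^{-1} d_i\), where \(d_i = Rp_i + t - q_i\) and \(C_i^p, C_i^q\) are per-point covariances.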

Alongside the ICP algorithm, the Normal Distributions Transform (NDT) [10] provides an alternative that eliminates the challenging task of establishing point correspondences. The NDT algorithm represents one point cloud as a set of local normal distributions and determines the transformation that maximizes the likelihood of the other point cloud under this spatial probability function. Hong and Lee [53] enhance the conventional NDT algorithm by introducing a probabilistic NDT representation. They assign probabilities to point samples, addressing the degeneration effect by incorporating computed covariance. Their study demonstrates that probabilistic NDT outperforms traditional NDT in odometry estimation.
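In its usual form, the NDT objective scores a candidate transform \(T\) as \(\mathrm{score}(T) = \sum_i \exp\big(-\tfrac{1}{2}(T(p_i)-\mu_{k})^\top \Sigma_{k}^{-1}(T(p_i)-\mu_{k})\big)\), where \(\mu_k\) and \(\Sigma_k\) are the mean and covariance of the map points in the voxel containing the transformed point \(T(p_i)\); maximizing this score requires no explicit point correspondences.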

Despite advancements in scan-to-scan matching algorithms, their accuracy is inherently limited. Consequently, recent LiDAR odometry works predominantly estimate the robot’s pose by utilizing both scan-to-scan and scan-to-map matching. IMLS-SLAM [32] estimates odometry through Implicit Moving Least Square (IMLS) representation-based scan-to-map matching. DLO [20] creates a submap for scan-to-map matching by combining point clouds from a selected subset of keyframes, including those forming the convex hull.

Conventional LiDAR odometry typically computes discrete odometry each time a new LiDAR point cloud is received. In contrast, certain methods aim to model a continuous trajectory, emulating the continuous motion of an actual robot. CT-ICP [31] accomplishes this by interpolating the positions of individual points within the LiDAR scan between the starting and ending poses. Subsequently, a continuous-time odometry estimate is obtained by registering each point through scan-to-map matching.

3.2 Feature-based matching

Feature-based approaches in LiDAR-only odometry extract feature points in the LiDAR point cloud and leverage them to estimate the transformation. Utilizing only feature points instead of the entire point cloud can improve computational speed and overall performance by eliminating outliers such as noise. The main challenge with feature-based methods lies in the selection of ‘good’ feature points that enhance point cloud registration performance.

LOAM [162, 163] identifies points on sharp edges and planar surface patches by assessing local surface smoothness and matching them to estimate the robot’s motion. Subsequent developments within the LOAM framework aim to improve performance by refining feature point selection. LeGO-LOAM [125] utilizes point cloud segmentation to classify points as either ground points or segmented points, ensuring accurate feature extraction. It leverages planar features from ground points and edge features from segmented points to incrementally determine a 6 degree-of-freedom (DOF) transformation. R-LOAM [101] and RO-LOAM [102] optimize the robot’s trajectory by incorporating mesh features derived from the 3D triangular mesh of a reference object with a known global coordinate location.
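As a rough sketch of this smoothness criterion (illustrative only; the exact formulation and thresholds in [162] differ in detail), the curvature of each point can be scored against its neighbors along the same scan line:

```python
import numpy as np

def scan_line_curvature(points, k=5):
    """Score local smoothness for points (N,3) ordered along one scan line.

    Low scores suggest planar patches; high scores suggest sharp edges,
    mirroring the LOAM-style c = ||sum_j (p_i - p_j)|| / (|S| * ||p_i||).
    """
    n = len(points)
    c = np.full(n, np.nan)
    for i in range(k, n - k):
        neighbors = np.r_[points[i - k:i], points[i + 1:i + k + 1]]
        diff = (points[i] - neighbors).sum(axis=0)   # summed offsets to neighbors
        c[i] = np.linalg.norm(diff) / (2 * k * np.linalg.norm(points[i]))
    return c

# Usage: pick the lowest-c points as planar features, highest-c as edge features.
```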

Plane features, prevalent in everyday environments, have garnered significant attention as they can be easily extracted from the LiDAR point cloud. SuMa [7] employs surface normals for odometry by comparing vertex and normal maps from the current scan with those rendered from a surfel-based map. SuMa++ [24] integrates semantic information from RangeNet++ [95] into the surfel-based map [7] and applies Semantic ICP, adding semantic constraints to the objective function of the ICP algorithm. F-LOAM [139] emphasizes extracting distinctive horizontal features from the point cloud of mechanical LiDAR, where data are sparse vertically and denser horizontally. This approach minimizes the risk of false feature detection in the horizontal plane. Zhou et al. [176] and \(\pi \)-LSAM [177] jointly optimize keyframe poses and plane parameters, referred to as plane adjustment (PA), in indoor environments. MULLS [104] extracts diverse feature points (ground, facade, pillar, beam) and employs scan-to-map multi-metric linear least square ICP (MULLS-ICP). VoxelMap [160] employs adaptive-size, coarse-to-fine voxel construction for robust handling of varying environmental structures and sparse, irregular LiDAR point clouds. It addresses uncertainties from both LiDAR measurement noise and pose estimation error through probabilistic plane representation.

Instead of ICP variants, the NDT algorithm can also be employed with features. NDT-LOAM [22] initially obtains approximate odometry using the weighted NDT (wNDT) algorithm. This initial estimate is then refined by incorporating corner and surface features. E-LOAM [45] extracts geometric and intensity features, enhances them with local structural information, and estimates odometry with D2D-NDT matching. Wang et al. [141] propose a coarse-to-fine registration scheme with NDT and PLICP (point-to-line ICP) [17]: the pose roughly estimated with NDT serves as the initial guess for PLICP, resulting in more accurate pose estimation.

3.3 Deep learning-based matching

While direct and feature-based methods exhibit effective performance in various environments, they often encounter difficulties with correspondence matching. It is crucial to maintain feature consistency and find the relationship between each scan to address this challenge. Some researchers investigate deep learning approaches, which hold promise in effectively addressing these issues. LO-Net [80] introduces a scan-to-scan LiDAR odometry network that predicts normals, identifies dynamic regions, and incorporates a spatiotemporal geometrical consistency constraint for improved interactions between sequential scans. LodoNet [173] utilizes a process of back-projecting matched keypoint pairs (MKPs) from LiDAR range images into a 3D point cloud. This involves employing an MKPs selection module inspired by PointNet [108], which aids in identifying optimal matches for estimating rotation and translation. Cho et al. [27] exploit unsupervised learning in LiDAR odometry, utilizing VertexNet to quantify point uncertainty and PoseNet to predict the relative pose between frames. The network incorporates geometrical information through estimating normal vectors and uses an uncertainty-weighted ICP loss. During unsupervised training, they address trivial solutions via an FOV loss.

4 LiDAR-inertial odometry

Table 2 The overarching summary of LiDAR-inertial odometry

LiDAR-only odometry is computationally efficient and requires no additional sensors. However, it cannot fully address the challenges detailed in Sect. 7. Therefore, recent LiDAR odometry commonly integrates LiDAR with an IMU. An IMU provides angular velocity and linear acceleration measurements, making it suitable for estimating coarse robot motion and enhancing pose estimation accuracy when used with LiDAR. LiDAR-inertial odometry can be divided into two categories based on how LiDAR and IMU data are fused: (1) loosely coupled and (2) tightly coupled.

The loosely coupled method independently estimates the state of each sensor, combines these states with weights, and then determines the robot’s state. This approach offers high flexibility, as it estimates the state of each sensor individually. It facilitates easy adaptation to changes in the sensor system without extensive modifications to the existing framework as long as a suitable odometry module is created for the new sensor modality. Furthermore, it permits assigning weights to specific sensors, ensuring robustness in case one sensor performs sub-optimally, as the odometry can still utilize data from other sensors.

On the other hand, the tightly coupled method utilizes measurements from all sensors concurrently to estimate the robot’s state. This results in potentially more accurate odometry, as it incorporates a greater number of constraints during the odometry estimation process compared to the loosely coupled method. However, this approach comes with a higher computational load, as all observations must be processed together. Additionally, it may be more susceptible to a loss of robustness if one sensor delivers poor-quality observations. A summary of LiDAR-inertial odometry literature is provided in Table 2. In the following subsections, the specifics of these approaches are introduced.

4.1 Loosely coupled approaches

Building on existing LiDAR-only methods, LOAM [162, 163] and LeGO-LOAM [125] advanced the field by incorporating an IMU to correct distortions in LiDAR scans and provide initial motion estimates. Extending these improvements, Zhou et al. [175] estimate the coarse pose of the robot using INS and encoder data, refining it with LiDAR odometry via the NDT algorithm. Tang et al. [134] use the extended Kalman filter (EKF) to fuse independent position results from LiDAR and IMU sensors. Similarly, Zhen et al. [172] employ the error-state Kalman filter (ESKF), merging the prior motion model from the IMU with LiDAR-derived partial posterior information for improved robustness and accuracy. Additionally, Hening et al. [50] utilize an adaptive EKF, incorporating residuals from both INS with GPS and INS with LiDAR, facilitating further refinement. On another front, Yang et al. [154] opt for pose graph optimization, combining INS and LiDAR scan matching-based estimates for accurate and reliable state estimation. While loosely coupled approaches improve accuracy over LiDAR-only methods and offer modular flexibility, they do not fully harness the synergy between sensors. This has led to increased research into tightly coupled methods, which seek to maximize sensor integration for enhanced performance.

4.2 Tightly coupled approaches

Shifting the focus to tightly coupled methods, this approach offers a distinct perspective on sensor fusion. Contrasting with the loosely coupled techniques, tightly coupled methods process data from multiple sensors in a unified framework. This integrated processing exploits the interdependencies among different sensor modalities, aiming to enhance both the accuracy and robustness of the state estimation process.

This approach begins with Zebedee [12], a pioneering effort in 3D LiDAR-inertial odometry. Zebedee optimizes surface correspondence error and IMU measurement deviations for odometry estimation. Initially, integrating IMU measurements directly into the factor graph posed computational challenges due to the high-frequency output of 6D pose parameters. The advent of the IMU preintegration method [40] addressed this issue by condensing hundreds of IMU measurements between keyframes into a single IMU preintegration factor. This facilitates the inclusion of each sensor measurement in the factor graph, accelerating the development of graph-based LiDAR odometry methods.
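In the on-manifold formulation of [40] (noise terms omitted here for brevity), the gyroscope and accelerometer readings \(\tilde{\omega}_k, \tilde{a}_k\) between keyframes \(i\) and \(j\) are condensed into relative motion deltas \(\Delta R_{ij} = \prod_{k=i}^{j-1}\mathrm{Exp}\big((\tilde{\omega}_k - b^g)\Delta t\big)\), \(\Delta v_{ij} = \sum_{k=i}^{j-1}\Delta R_{ik}(\tilde{a}_k - b^a)\Delta t\), and \(\Delta p_{ij} = \sum_{k=i}^{j-1}\big[\Delta v_{ik}\Delta t + \tfrac{1}{2}\Delta R_{ik}(\tilde{a}_k - b^a)\Delta t^2\big]\). These deltas depend only on the measurements and bias estimates, so they need not be re-integrated when keyframe states are re-linearized during optimization.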

Building upon these advancements, further innovations emerged in the field. LIPS [43] constructs a factor graph with continuous IMU preintegration factors and 3D plane factors from LiDAR measurements, solving the graph-based optimization problem to obtain robot odometry. IN2LAMA [76] utilizes upsampled preintegrated measurements (UPMs) [75] from IMU for de-skewing LiDAR scans, formulating a batch on-manifold optimization with LiDAR factor, IMU bias factor, and inter-sensor time-shift factor. Its next version, IN2LAAMA [77], introduces the IMU preintegration factor, similar to their previous work, IN2LAMA, but stands out by using UPMs to precisely de-skew all LiDAR measurements. While this advanced de-skewing process enhances accuracy, it may impact the real-time operation. In LIO-SAM [126], motion estimated through IMU preintegration serves a dual purpose: de-skewing LiDAR scans and introducing a factor into the factor graph. In addition, Ye et al. [155] leverage LiDAR scans and preintegrated IMU measurements for joint optimization with rotational-constrained refinement.

Further advancements in tightly coupled methods have been made, focusing on feature selection and global optimization. KFS-LIO [81] introduces a metric for selecting the most effective subset of LiDAR features, streamlining existing graph-based methods. Li et al. [79] exploit hierarchical pose graph optimization with a novel feature extraction method of scanning solid-state LiDAR, which has an irregular scanning pattern and a metric weighting function for quantifying each LiDAR feature’s residual. Koide et al. [71] leverage GPU-accelerated voxelized Generalized-ICP matching cost factor and IMU preintegration factor. They employ a keyframe-based fixed-lag smoothing technique to estimate low-drift trajectories efficiently and create a factor graph that minimizes global registration errors throughout the map. Additionally, Setterfield et al. [124] directly include feature correspondences from LiDAR measurement into a factor graph.

Unlike prior discrete-time methods, CLINS [94] employs a continuous-time framework utilizing cubic B-splines, allowing trajectory estimation at any given time by optimizing control points and knots. CLINS excels in handling asynchronous data from LiDAR and IMU sensors and managing high dynamic scenarios with small knot distances. This makes it adept at handling point clouds with potential distortions due to different acquisition times. PGO-LIOM [128] introduces a gradient-free optimization algorithm and a fully parallel Monte Carlo sampling approach specifically designed to address challenges posed by nonlinear and non-continuous problems that are difficult to handle with low-power onboard computers. They also integrate acceptance-rejection sampling [39] into feature matching cost, allowing the system to account for correct and incorrect feature matching concurrently. Wildcat [113] integrates asynchronous LiDAR and IMU measurements using continuous-time trajectory representations in a sliding-window fashion. DLIO [21] leverages the hierarchical geometrical observer instead of a filter for performance-guaranteed state estimation. Also, they propose a new coarse-to-fine approach for the continuous trajectory with a constant jerk and angular acceleration model to reduce computational overhead significantly.

As graph-based approaches progress, various factors are integrated into factor graphs to improve odometry performance. However, the increasing computational demands of such methods have led to a growing interest in approaches with lighter computational loads. Consequently, several filter-based approaches, often based on the classical Kalman filter, have emerged. LINS [110] utilizes an iterated error-state Kalman filter (iESKF) for faster odometry estimation compared to graph-based approaches. Despite attempts to enhance computational efficiency, the LINS system still faces challenges with a considerable computational load and slow processing speed, particularly when calculating the Kalman gain due to the substantial number of LiDAR measurements. FAST-LIO [151] successfully addresses this issue by introducing a novel Kalman gain formula. FAST-LIO2 [152] further improves accuracy by eliminating the feature extraction process and directly registering raw LiDAR measurements to the map. They also enhance computation speed with a data structure called an ikd-Tree. Faster-LIO [4] replaces ikd-Tree with incremental voxels (iVox) for faster search. Shi et al. [129] utilize the Invariant EKF to mitigate the linearization errors inherent in EKF-based odometry, which can significantly impact estimation performance. The invariant EKF [6] demonstrates enhanced convergence and consistency compared to the standard EKF, resulting in more reliable results. Additionally, they introduce two novel methodologies: Inv-LIO1 and Inv-LIO2. Inv-LIO1 initially estimates the state through scan-to-scan matching and refines it using a mapping module. In contrast, Inv-LIO2 achieves superior accuracy with increased computation time by performing map-refined odometry through scan-to-map matching and integrating global map updates.
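To illustrate the filter-based pipeline, the following is a schematic error-state Kalman filter reduced to a translational toy model in Python; rotation handling, the scan-matching measurement model, and the state layout are all simplifications of ours, not the formulation of LINS or FAST-LIO.

```python
import numpy as np

# State x = [position(3), velocity(3), accelerometer bias(3)]; names are ours.

def predict(x, P, accel, dt, Q):
    """Propagate the nominal state with one IMU sample and the error covariance."""
    p, v, b = x[0:3], x[3:6], x[6:9]
    a = accel - b                                   # bias-corrected acceleration
    x = np.concatenate([p + v * dt + 0.5 * a * dt**2, v + a * dt, b])
    F = np.eye(9)                                   # error-state transition Jacobian
    F[0:3, 3:6] = np.eye(3) * dt
    F[0:3, 6:9] = -0.5 * np.eye(3) * dt**2
    F[3:6, 6:9] = -np.eye(3) * dt
    return x, F @ P @ F.T + Q * dt

def update(x, P, p_lidar, R_meas):
    """Correct the state with a scan-matching result treated as a position fix."""
    H = np.zeros((3, 9))
    H[:, 0:3] = np.eye(3)
    z = p_lidar - x[0:3]                            # innovation
    # Naive gain; FAST-LIO's contribution is an equivalent form that avoids
    # inverting a matrix whose size grows with the number of measurements.
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R_meas)
    x = x + K @ z
    P = (np.eye(9) - K @ H) @ P
    return x, P
```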

Advancements in graph-based and filter-based approaches have substantially enhanced the reliability of LiDAR-inertial odometry in typical environments. Moreover, methods are now specifically designed to robustly estimate odometry in complex scenarios such as dynamic and degenerative environments. Ding et al. [33] exploit factor graph optimization based on a Bayesian network, considering highly dynamic scenarios such as urban areas. RF-LIO [109] begins with an initial pose estimation using IMU preintegration. It utilizes the error between IMU preintegration and scan matching to create a range image and eliminate dynamic points. In addition, RF-LIO employs graph optimization to enhance pose estimation further. Similar to RF-LIO, Hu et al. [57] integrate segmentation-based moving object detection and verification into FAST-LIO2 [152] to handle inaccurate data association in dynamic environments. LIMOT [178] estimates the poses of the ego vehicle and of dynamic target objects via trajectory-based multi-object tracking. By separating dynamic and static object pose factors, the factor graph can filter dynamic objects while simultaneously estimating the pose. Kim et al. [68] propose an adaptive keyframe generation scheme that considers the surrounding environment, enabling higher odometry accuracy in extreme environments.

Furthermore, a variety of constraints and metrics have been developed to further refine odometry accuracy. LION [133] incorporates an observability metric to anticipate potential declines in the quality of estimated odometry. This observability score guides the system's transition to an alternative odometry algorithm facilitated by a supervisory algorithm like HeRo [120]. LIO-Vehicle [150] incorporates ground-vehicle motion constraints to handle geometrically degraded environments by extending 2-DOF vehicle dynamics into a preintegrated factor. Zeng et al. [161] propose a feature extraction scheme based on single-line depth variation, specifically designed for the non-uniform sampling characteristics of scanning solid-state LiDAR point clouds. Chen et al. [19] leverage SE(2)-constrained pose estimation for ground vehicles to handle non-SE(2) motion perturbations. Li et al. [78] improve feature extraction by incorporating intensity edge features within geometric planar features. They also employ multi-weighting functions based on residuals and registration consistency to assess the quality of each feature during pose optimization. Furthermore, RI-LIO [167] combines two residual types in its state estimation: photometric errors from reflectivity images and point-to-plane distances from geometric points. These images are generated using the Corrected Projection by Real Angle (CPBRA) method, addressing LiDAR laser projection biases.

Another method to enhance accuracy involves high-frequency odometry, where advancements are made through techniques that improve motion estimation by segmenting LiDAR scans. LoLa-SLAM [66] achieves low-latency localization with a high temporal update rate by slicing LiDAR scans, ensuring sufficient measurements for accurate matching. This method is crucial for high-frequency odometry as it allows more frequent and timely updates of the vehicle's position. On the other hand, FR-LIO [171] deals with aggressive motion by adaptively dividing each LiDAR scan into multiple sub-frames, enhancing estimation robustness. Such division is essential for maintaining accuracy in high-frequency odometry, particularly in dynamic environments. Additionally, Zhao et al. introduce the iterated ESKFS to mitigate potential degeneration issues caused by increased sub-frames. Point-LIO [47] achieves high-frequency odometry through a point-by-point framework, processing LiDAR scans at the individual point level, a strategy that naturally eliminates motion distortion. These high-frequency methods offer a path to more responsive and accurate odometry in rapidly changing scenarios.

Similar to LiDAR-only odometry, deep learning methods play a pivotal role in enhancing odometry estimation. Chen et al. [23] integrate a factor graph for state estimation and plane-driven submap matching with a learning-based point cloud network for loop detection. Liu [88] exploits an adaptive particle swarm filter with an efficient resampling strategy to handle diverse environments, integrated with lightweight learning-based loop detection. Liu and Ou [89] propose FG-LC-Net [90] for learning-based loop closure and the S-Voxel data structure to improve system speed.

5 Multiple LiDARs

LiDAR-inertial odometry, discussed in Sect. 4, achieves impressive accuracy. Nevertheless, the limited FOV of certain LiDAR systems poses challenges to state estimation, hindering further advancements. Additionally, interference from other sensors can obscure regions within the LiDAR's FOV. Irregular scanning patterns, observed in some scanning solid-state LiDARs, further complicate precise scan registration due to sparsity.

To tackle challenges associated with single LiDAR systems, researchers are increasingly exploring the use of multiple LiDARs in odometry. Multiple LiDARs offer broader scanning coverage, reducing interference from additional sensors. Integrating diverse scanning patterns from multiple LiDARs enhances accuracy in scan registrations, surpassing reliance on a single LiDAR with a non-repetitive scanning pattern.

Pioneering research in multiple LiDAR-based odometry begins with M-LOAM [62]. Assuming the synchronization of all LiDARs, M-LOAM involves feature extraction from each LiDAR, data aggregation, and estimation of the robot's state. However, synchronizing multiple LiDARs using PPS (Pulse Per Second) introduces complexity and necessitates additional hardware. On the other hand, synchronization through PTP (Precision Time Protocol) primarily unifies time standards but may demand extra effort to attain synchronized data. Lin et al. [86] employ a decentralized extended Kalman filter (EKF) that concurrently runs multiple EKF instances, one for each LiDAR. While this method can handle asynchronous LiDARs, it does not fully leverage the combined measurements from all LiDARs simultaneously, which reduces the benefits of using multiple LiDARs.

In the case of independently utilizing measurements from each LiDAR for state estimation, the occlusion experienced by a single LiDAR can have a cascading impact on subsequent state estimation. LOCUS [103], which assumes that all the LiDARs are synchronized, points out that significant time discrepancies can result in failures in state estimation. In their subsequent research [115], they address this challenge by discarding delayed scans to enhance robustness, although this approach comes at the expense of losing some information. Similarly, M-LIO [30] acknowledges the asynchrony among LiDARs through signal association. However, it lacks a method to compensate for the temporal discrepancies arising from the asynchrony.

To overcome these issues, researchers have integrated IMU sensors for correcting temporal discrepancies in asynchronous LiDAR measurements [64, 98, 100, 142], similar to their role in LiDAR-inertial odometry. Nguyen et al. [98] and Wang et al. [142] employ IMU propagation to compensate for temporal discrepancies among multiple LiDARs. They extract edge and planar features from each point cloud and transform these features into a common reference frame aligned with the most recent acquisition time from all LiDARs. While these approaches successfully estimate robot trajectories, they also introduce additional challenges. IMU propagation, which is inherently discrete due to its frequency, requires additional linear interpolation, potentially leading to additional errors. Moreover, as time discrepancies become more pronounced, the duration required to accumulate the point clouds increases, which further intensifies the dependence on IMU for state propagation. However, the accuracy of the IMU propagation deteriorates over extended periods due to noise, which can adversely impact the odometry.

In addressing the challenge of discrete IMU propagation, MA-LIO [64] adopts B-spline interpolation [131] as an alternative for linear interpolation, effectively compensating for temporal discrepancies. Furthermore, Jung et al. [64] leverage point-wise uncertainty to assign penalties based on the acquisition time, addressing the challenge of degraded IMU propagation accuracy. On the other hand, SLICT [100] interprets the point clouds of each LiDAR as a continuous stream. Combining only the point clouds captured within a designated interval, SLICT maintains a consistent accumulation duration, even when significant time discrepancies exist.
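To sketch why splines suit asynchronous data (a generic uniform cubic B-spline on positions only; continuous-time methods such as MA-LIO operate on the full SE(3) state with a cumulative formulation, which is more involved):

```python
import numpy as np

# Standard basis matrix of a uniform cubic B-spline.
M = np.array([[1, 4, 1, 0],
              [-3, 0, 3, 0],
              [3, -6, 3, 0],
              [-1, 3, -3, 1]]) / 6.0

def spline_position(ctrl, t, dt, t0=0.0):
    """Evaluate a uniform cubic B-spline trajectory at an arbitrary time t.

    ctrl: (K,3) control points spaced dt apart starting at t0. Because any
    query time maps smoothly onto 4 local control points, asynchronous
    measurements can be evaluated at their exact timestamps.
    """
    s = (t - t0) / dt
    i = int(np.floor(s))                 # index of the spline segment
    u = s - i                            # normalized position within the segment
    U = np.array([1.0, u, u * u, u ** 3])
    return U @ M @ ctrl[i:i + 4]         # blend 4 neighboring control points
```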

Utilizing multiple LiDARs for odometry addresses the limitations associated with a single-LiDAR configuration, leading to improved performance. However, challenges such as optimizing LiDAR placement [56], increased computational demands, and inherent issues of single-LiDAR systems persist. Section 7.5 examines these challenges. Additionally, to enhance robustness, especially in challenging scenarios, researchers have explored the integration of LiDAR with other sensor modalities. This integration and its impact on system performance are discussed in more detail in Sect. 6.

6 Fusion with other sensors

LiDAR demonstrates robustness to changes in lighting conditions, unlike visual sensors; nevertheless, it confronts challenges in demanding environments. Specifically, LiDAR odometry encounters difficulties in obtaining accurate measurements under adverse conditions such as rain, snow, and dust. Moreover, LiDAR measurements are vulnerable in areas with limited geometric features or repetitive topographical attributes, such as long tunnels or highways. This susceptibility contributes to scan matching challenges, negatively affecting the precision of state estimation. Addressing these constraints involves exploring the integration of multiple sensor modalities, marking a notable frontier in current research.

Fig. 3 LiDAR odometry pipeline. The common framework for LiDAR odometry can be broadly divided into three stages: preprocessing, initial estimation, and state estimation. When incorporating other sensors, their integration is classified as either loosely coupled or tightly coupled, based on the specific stage at which the additional sensor data are utilized. In the state estimation stage, the refined state is leveraged for both odometry and mapping

RGB cameras offer distinct advantages over LiDAR sensors, excelling in capturing intricate details through color and texture. This capability becomes crucial in environments where prominent geometric features are scarce. In such scenarios, combining camera images with LiDAR measurements can significantly enhance the reliability of state estimation. Lin et al. [87] propose R\(^2\)live, a tightly coupled LiDAR-visual-inertial odometry system that merges a high-rate filter-based approach with a low-rate graph optimization. The high-rate filter leverages LiDAR, camera, and IMU measurements, while the factor graph optimizes local maps and visual landmarks. LVI-SAM [127] consists of two jointly operating subsystems: the LiDAR-inertial system (LIS) and visual-inertial system (VIS). The estimated pose from each subsystem serves as the initial pose for the other. LIS operates independently only when the number of features in VIS decreases due to aggressive motion or illumination changes, leading to a failure of LIS [164]. Similar to R\(^2\)live, R\(^3\)live [85] also separates the LiDAR-inertial odometry (LIO) and visual-inertial odometry (VIO). LIO reconstructs geometric structures, while VIO reconstructs texture information. The proposed VIO system utilizes RGB-colored point cloud maps to estimate the state, minimizing photometric errors without the need to detect visual features, thus saving processing time. Fast-LIVO [174] enhances efficiency by directly registering point clouds without extracting features. This optimization is achieved by reusing the point clouds from both the LIO and VIO subsystems, resulting in faster operation and improved overall system efficiency. Additionally, LIC-fusion [181, 182] fuses sparse LiDAR features with visual features through a multi-state constraint Kalman filter (MSCKF) along with online multi-sensor calibration. In the context of continuous-time SLAM, there has been a growing interest in continuous-time LiDAR-visual-inertial odometry. An example of such an approach is Coco-LIC [74]. This system adopts a non-uniform B-spline-based continuous-time trajectory representation, seamlessly integrating LiDAR and camera data in a tightly coupled manner.

RGB cameras depend on ambient lighting conditions to capture images, and their performance tends to degrade in low-light or adverse weather conditions. In response to these challenges, thermal cameras operating in the infrared wavelength range have proven effective in visually degraded environments with varying illumination. Rho et al. [116] utilize stereo thermal cameras in conjunction with LiDAR for indoor disaster scenarios. Moreover, radar and event cameras have demonstrated robust performance in challenging environmental conditions. Thermal cameras, radar, and event cameras, when used in conjunction with LiDAR, offer distinct advantages, presenting practical alternatives to address the limitations of RGB cameras. Harnessing these diverse sensor modalities can significantly improve odometry accuracy, as highlighted in [13].

These sensor modalities extend beyond mobile robots or handheld systems and find application in legged robots. Legged robots excel in navigating bumpy terrains and overcoming obstacles like rocks or debris, leveraging their unique ability to step over them. This capability makes legged robots well-suited for tasks such as search and rescue missions, exploration, and disaster response. VILENS [148] utilizes measurements from LiDAR, IMU, cameras, and leg contact information derived from a joint kinematics model. This integrated sensor fusion empowers the system to attain accurate odometry, even in demanding environments.

Integrating multiple sensors for odometry presents practical solutions for addressing diverse environmental conditions. However, this approach comes with computational demands and introduces specific issues associated with each sensor. While sensor fusion can compensate for the limitations of individual sensors, the fusion process itself requires considerable effort. These limitations will be scrutinized further in Sect. 7.5. Indiscriminate sensor fusion may not lead to an optimal odometry solution. Hence, thorough planning and a precise grasp of each sensor’s specific requirements are crucial prerequisites before deploying sensor fusion.

The classifications of LiDAR odometry introduced so far can be organized into a unified pipeline, as shown in Fig. 3. The figure illustrates how additional sensors can be incorporated into a LiDAR odometry system, guiding the determination of sensor usage and data integration strategies.

7 Remaining challenges

Undeniably, LiDAR odometry technologies have witnessed significant advancements in providing high-quality positioning for mobile robots and autonomous vehicles, with their performance demonstrated in various real-world environments [36, 165]. However, despite these significant advancements, unresolved issues remain that merit further research. This section discusses these issues and proposes future directions for LiDAR odometry.

7.1 LiDAR inherent problems

LiDAR, while offering accurate measurements and resilience to lighting conditions in contrast to RGB cameras, is not exempt from inherent limitations. In this subsection, we highlight several constraints of the LiDAR sensor that pose challenges in solving the odometry problem.

Large Data: The LiDAR system generates a voluminous 3D point cloud containing rich environmental and object data. It offers a significant advantage in capturing 3D information about the surrounding environment; however, the data size poses challenges. The size of the point cloud scales with the LiDAR's FOV and resolution. For instance, the OS1-128 LiDAR can produce scans containing several hundred thousand points per frame at a maximum frequency of 20 Hz. Additionally, each point in the point cloud carries information such as range, intensity, reflectivity, ambient conditions, and acquisition time, adding to the data volume. Real-time processing of such extensive data requires substantial computational power, posing a particular challenge in robotics, where achieving real-time performance is crucial for effective operation.
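To put this volume in perspective (assuming, for illustration, a 1024-column horizontal resolution), a 128-channel sensor produces \(128 \times 1024 \approx 1.3\times10^5\) points per frame; at 20 Hz this amounts to roughly 2.6 million points per second, and with range, intensity, reflectivity, ambient, and timestamp fields attached to every point, the raw stream can reach tens of megabytes per second.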

When integrating multiple LiDARs or adding extra sensors, the computational load intensifies, potentially impacting real-time performance. Techniques such as downsampling or feature extraction can help alleviate the computational burden, but it is evident that computational costs increase with the number and resolution of LiDARs. In two studies [98, 100] utilizing the NTU VIRAL dataset [99], which includes two 16-channel LiDARs, the optimization processes took over 100 ms, equivalent to the duration of a LiDAR sweep. While this processing time may be acceptable for systems using keyframes, it becomes impractical in scenarios requiring estimations for every scan.

Motion Distortion: When a robot moves at a high speed relative to the sensor’s data acquisition frequency, a substantial spatial gap can occur between the locations where the data was obtained at the beginning and end of a single LiDAR scan. This spatial gap has the potential to introduce significant distortion [3] to the LiDAR scan. Therefore, to effectively utilize LiDAR scans, it is necessary to apply a compensation process to mitigate the distortions caused by motion, commonly referred to as de-skewing.

De-skewing commonly employs high-frequency sensors such as IMU [126, 166] for aligning points to a single frame. Linear interpolation [100, 152] can address its discrete nature and the mismatch between sensor measurements and their actual positions. In the absence of extra sensors, a constant velocity model [54, 110, 162] may suffice but lacks accuracy in aggressive motion or uncertain velocity estimations. Continuous-time interpolation [74, 94, 179], an alternative approach, estimates a continuous trajectory through B-spline interpolation, ensuring accurate transformations for each LiDAR point. However, this method significantly increases computational demands, particularly with more points, as each requires individual state calculation. Thus, balancing accuracy and efficiency is crucial, with the choice depending on the application’s specific needs and constraints.
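A minimal de-skewing sketch under the two-pose linear model follows (our illustrative names; real systems typically interpolate across many IMU-propagated poses rather than just the scan's start and end):

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def deskew(points, times, pose_start, pose_end, t_start, t_end):
    """De-skew one LiDAR scan by linearly interpolating between two poses.

    points: (N,3) raw points; times: (N,) per-point acquisition timestamps.
    pose_*: (Rotation, translation (3,)) sensor poses at scan start and end.
    Returns all points expressed in the sensor frame at t_end.
    """
    R0, t0 = pose_start
    R1, t1 = pose_end
    alpha = np.clip((times - t_start) / (t_end - t_start), 0.0, 1.0)
    slerp = Slerp([0.0, 1.0], Rotation.concatenate([R0, R1]))
    R1_inv = R1.inv()
    out = np.empty_like(points)
    for i, (p, a) in enumerate(zip(points, alpha)):
        R_i = slerp(a)                       # interpolated rotation at this timestamp
        t_i = (1 - a) * t0 + a * t1          # interpolated translation
        p_world = R_i.apply(p) + t_i         # point in the world frame
        out[i] = R1_inv.apply(p_world - t1)  # re-express in the scan-end frame
    return out
```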

Limited Sensing: LiDAR, while capable of measuring long distances, presents inherent limitations. One prominent drawback is its relatively narrow vertical FOV, which is particularly problematic for perception tasks. Additionally, LiDAR data tend to be sparser than images from standard cameras, even though the horizontal FOV is generally wider. Recently, advancements in vertical-cavity surface-emitting laser (VCSEL) technology have enabled the compact arrangement of numerous lasers in a dense array. Despite this advancement, resulting in sensors with more channels and denser data, the resolution remains lower than that of conventional cameras. In addition, mechanical LiDARs are often installed in open areas, such as the top of robots or autonomous vehicles, to achieve 360-degree visibility. However, this poses challenges in protecting the sensor from external shocks, and attempts to install the sensor in more sheltered locations trade off FOV visibility.

7.2 Heterogeneous LiDARs

In Sect. 2.2, we discuss the classification of LiDAR sensors into two categories: mechanical and scanning solid-state LiDARs. These categories exhibit distinct characteristics, including variations in viewing angles, scanning patterns, and more. As a result, these disparities essentially lead to the requirement for different odometry algorithms. Moreover, even within the same category of LiDAR, variations in FOV, resolutions, and other factors exist across different manufacturers and product lines. This implies that an algorithm effective with one type may necessitate adjustments to additional parameters when applied to another. Recognizing the inconvenience of modifying methods based on the specific sensor, there is a growing demand for an algorithm capable of robust operation across all types of LiDAR.

KISS-ICP [137] stands out as a representative approach to addressing these issues. It proposes a simplified yet effective LiDAR-only odometry approach relying on point-to-point ICP that performs comparably with other LiDAR-only methods across various platforms and environmental conditions. Notably, the proposed system is versatile across a broad spectrum of operating conditions and LiDAR sensors. While KISS-ICP proves to be a simple and versatile solution for various LiDAR sensors, a generalized methodology for LiDAR-inertial odometry and for fusion with other sensors is still lacking. Consequently, there remains room to improve the performance of generalized approaches overall.

7.3 Degenerative environment

Traditional LiDAR odometry depends primarily on geometric measurements, neglecting texture and color information. This reliance becomes challenging in feature-scarce and repetitive environments, such as tunnels and long corridors. While LiDAR effectively scans these settings, the absence of unique features often leads to ambiguity in scan matching, resulting in potential inaccuracies in robot pose estimation.

To tackle this challenge, Zhang et al. [164] introduce a mathematical definition of a degeneracy factor, derived and evaluated using eigenvalues and eigenvectors, enabling more accurate state estimation when degeneracy is detected. AdaLIO [83] introduces an adaptive parameter setting strategy, advocating the use of environment-specific parameters to address the degeneracy issue. Their straightforward approach pre-defines parameters for general and degenerate scenarios and adjusts them based on the situation. Wang et al. [138] mitigate the uncertainty associated with the corresponding residual and address the degeneration problem by removing eigenvalue elements from the distribution covariance component. Shi et al. [130] propose an adaptive correlative scan matching (CSM) algorithm that dynamically adjusts motion weights based on degeneration descriptors, enabling autonomous adaptation to different environments. This approach aligns the initial pose weight with environmental characteristics, resulting in improved odometry.
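One common way to operationalize such a check (a sketch in the spirit of [164], not their exact procedure) is to examine the eigenvalues of the approximate Gauss-Newton Hessian of the scan-matching problem:

```python
import numpy as np

def degenerate_directions(J, threshold):
    """Flag poorly constrained directions of a scan-matching problem.

    J: (N,6) stacked residual Jacobian w.r.t. the 6-DOF pose. Small
    eigenvalues of J^T J indicate directions (e.g., along a tunnel axis)
    in which the point cloud does not constrain the solution.
    """
    H = J.T @ J                              # approximate Gauss-Newton Hessian
    eigvals, eigvecs = np.linalg.eigh(H)     # eigenvalues in ascending order
    weak = eigvals < threshold
    return eigvals, eigvecs[:, weak]         # weak columns span the degenerate subspace

# A solver can then freeze updates along the returned eigenvectors, or fall
# back to another sensor, whenever any eigenvalue drops below the threshold.
```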

Sensor fusion methods also have shown the potential to address the uncertainty in LiDAR scan matching within degenerative cases. DAMS-LIO [46] estimates LiDAR-inertial odometry utilizing the iterated extended Kalman filter (iEKF). When the system detects degeneration, it employs a sensor fusion strategy, following a loosely coupled approach that integrates odometry results from each sensor.

LiDAR has the potential to overcome degenerative environments without sensor fusion if additional information beyond geometric details can be extracted from the measurements. Researchers have explored leveraging intensity [78, 106, 140] or reflectivity [35, 167] data from LiDAR measurements to enhance state estimation in degenerate environments. Integrating supplementary texture information with the original geometric data offers a more robust and reliable solution, particularly in challenging scenarios where geometric features alone may not suffice for accurate localization and mapping. Furthermore, by employing FMCW LiDAR to measure Doppler velocity similar to radar, DICP [51] improves the vanilla ICP algorithm with a Doppler velocity objective term, enhancing scan matching performance, especially in feature-scarce environments. Notably, their method estimates odometry with high accuracy even in the demanding scenario of a 900-meter-long tunnel sequence. Improving upon DICP, Wu et al. [149] and Yoon et al. [158] integrate the Doppler velocity factor into a continuous-time odometry framework. These works suggest that the degeneracy problem can be effectively addressed through the use of FMCW LiDAR.

7.4 Degraded environment

A degraded environment is one that challenges the sensing ability of LiDAR, unlike a degenerative environment. LiDAR operates by emitting a laser pulse and detecting its return after interacting with objects, and this process can be disrupted by unwanted particles obstructing the pulse's path. Extreme weather conditions such as direct sunlight, rain, snow, or fog can significantly degrade LiDAR's detection performance [11, 132]. Considerable research has therefore been dedicated to denoising weather-induced interference. Park et al. [105] propose a Low-Intensity Outlier Removal (LIOR) filter to eliminate snow particles from the LiDAR point cloud. Utilizing a CNN-based approach, WeatherNet [49], a variant of LiLaNet [107], is trained with data augmented by fog and rain models, aiming to remove noise caused by adverse weather from actual LiDAR data. Despite extensive research on weather noise removal, there is a lack of investigation into the performance of LiDAR odometry when paired with these algorithms. Exploring this area is essential to ensure that LiDAR odometry consistently delivers high-level performance under harsh weather conditions, ensuring the stability of autonomous driving.

Beyond weather conditions, typical objects, such as glass, that partially reflect or transmit laser pulses [41, 144, 170] can adversely affect LiDAR performance. This problem is particularly prevalent in urban or indoor settings with numerous glass windows, where reflections from one side can interfere with the LiDAR points on the opposite side of the glass. This issue can impact odometry performance due to the ambiguity in scan matching. However, there is currently a lack of research on algorithms to address this problem completely.

7.5 Multi-modal sensors

When integrating additional sensors with LiDAR, it is crucial to acknowledge that these supplementary sensors introduce their own set of challenges. Moreover, the combination of multiple sensors can introduce new limitations and complexities. This subsection delves into these additional considerations.

Calibration: When working with multiple sensors, it is essential to conduct both intrinsic calibration for each sensor and extrinsic calibration between the sensors. Despite the availability of calibration tools and methodologies [34, 91, 114], this calibration process remains highly challenging and complex. Precise intrinsic calibration of each sensor and accurate extrinsic calibration between sensors involve addressing diverse error sources, accounting for environmental factors, and managing complex mathematical transformations. These intricacies can make the process time-consuming and demanding for researchers and practitioners alike. Even with precise calibration tools, calibration is problematic in systems that cannot impose constraints on every sensor degree of freedom. For instance, car-like vehicles often provide insufficient constraints for the z-axis, roll, and pitch angles; as a result, the accuracy of these elements may not surpass that achieved through manual measurement.
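
Once an extrinsic transform has been estimated, applying it is straightforward; the difficulty lies entirely in obtaining it accurately. A minimal sketch, assuming a calibrated 4 x 4 homogeneous extrinsic `T_body_lidar` that maps LiDAR-frame coordinates into the body (e.g., IMU) frame:

```python
import numpy as np

def lidar_to_body(points_lidar, T_body_lidar):
    """Express LiDAR points in the body frame via the extrinsic.

    points_lidar: (N, 3) points in the LiDAR frame
    T_body_lidar: (4, 4) homogeneous transform from calibration
    """
    homogeneous = np.hstack([points_lidar,
                             np.ones((len(points_lidar), 1))])
    return (homogeneous @ T_body_lidar.T)[:, :3]
```

Errors in the poorly constrained components (z, roll, and pitch on a car-like platform) propagate directly through this transform into every fused measurement, which is why those components are often fixed from careful manual measurement instead.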

Placement: Simply adding more sensors without strategic planning may not improve odometry performance. In the case of multiple LiDARs, positioning them to complement one another’s scanning areas has the potential to improve accuracy. However, excessive overlap leads to redundancy, introducing unnecessary data and increasing computational cost, potentially offsetting accuracy gains [64]. Careful consideration of optimal deployment strategies is therefore crucial. Although Hu et al. [56] discuss effective multi-LiDAR placement strategies, their focus is on object detection rather than odometry, so dedicated studies in this domain are still needed. This challenge also extends to multi-modal sensor fusion. As with multiple LiDARs, the configuration of each sensor is crucial in system design: different sensors serve unique roles with diverse recognition capabilities, and to maximize their combined strengths, careful consideration is essential in deciding whether each sensor’s FOV should overlap.

Synchronization: Integrating different sensor modalities necessitates handling asynchronous scenarios, as each sensor delivers data at a distinct frequency. While some studies adeptly fuse heterogeneous LiDAR data in discrete time [98] or continuous time [64] using an IMU, there is a relatively limited body of work on the integration of other sensor modalities. Exploring comprehensive approaches that harness the capabilities of different sensor modalities holds significant potential.
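
A common building block for such asynchronous fusion is interpolating a pose to an arbitrary query time, linearly for translation and spherically for rotation. A minimal sketch, assuming timestamped poses from a higher-rate source (e.g., an IMU-driven state) and a query time inside the stamped interval:

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_pose(t_query, stamps, positions, rotations):
    """Interpolate a 6-DOF pose at t_query from timestamped poses.

    stamps:    (N,) strictly increasing timestamps bracketing t_query
    positions: (N, 3) translations
    rotations: scipy Rotation object holding the N orientations
    """
    rot = Slerp(stamps, rotations)(t_query)      # spherical interpolation
    pos = np.array([np.interp(t_query, stamps, positions[:, k])
                    for k in range(3)])          # per-axis linear interp.
    return pos, rot
```

Continuous-time formulations generalize this idea by fitting splines or Gaussian processes over the trajectory, letting every measurement be evaluated at its own timestamp.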

8 Datasets and evaluation

Table 3 The LiDAR-based odometry-related datasets

Ensuring the generalization of LiDAR odometry remains a fundamental goal in its advancement, as elaborated in Sect. 7. As autonomous systems navigate diverse and dynamic environments, algorithms must exhibit consistent performance, irrespective of variations in data quality. Consequently, the significance of comprehensive datasets spanning various environments and sensor modalities cannot be overstated in the development of such algorithms. Diverse data enhance robustness, reducing the risk of overfitting and expanding the versatility of techniques. Simultaneously, establishing standardized evaluation methodologies is crucial to ensure consistent and comparable results across diverse research endeavors. With the growing role of LiDAR odometry in robotics, there is an increased emphasis on creating diverse datasets and refining assessment protocols. These strategic initiatives are essential for effectively addressing various operational challenges.

8.1 Public datasets

Various LiDAR datasets have contributed significantly to odometry research, each with unique features and limitations. In this section, we will present public LiDAR datasets along with their respective characteristics. Public LiDAR datasets are summarized in Table 3.

The KITTI dataset [42], which captures urban environments using the HDL-64E spinning LiDAR, stands as a renowned resource in the LiDAR community. Its overlapping sequences within and between sessions facilitate precise odometry evaluation, contributing significantly to LiDAR odometry advancements. The NCLT dataset [16], collected over a year using a Segway-based system, and the MulRan dataset [69], spanning around a month, offer spatial diversity from campuses to cityscapes. The Boreas dataset [14], collected over a year on cityscape routes, captures seasonal changes and harsh weather conditions such as rain and heavy snow. While these datasets are invaluable resources for developing odometry algorithms tailored to a specific LiDAR type, they are limited in terms of LiDAR hardware diversity, predominantly relying on a single type of LiDAR. This poses a challenge for algorithms aiming to achieve broader hardware compatibility.

The Complex Urban dataset [61], Oxford Radar Robotcar dataset [5], and EU Long-term dataset [153] distinguish themselves from previous datasets as they utilize multiple LiDARs. The Complex Urban dataset captures data across various urban environments, while the Oxford Radar Robotcar dataset focuses on data collection from a single location for consistency. The EU Long-term dataset, spanning data acquisition over two locations for approximately a year, showcases diverse weather conditions. Despite using multiple LiDARs in these datasets, challenges in generalization arise due to their consistent use of homogeneous LiDAR configurations and a focus on structured environments. This raises concerns about the performance of LiDAR odometry in diverse settings.

The Ford AV dataset [2] addresses the previously mentioned limitation by ensuring location diversity. It captures seasonal variations and various driving scenarios, encompassing freeways, residential areas, tunnels, and vegetation-rich zones, utilizing four HDL-32E LiDARs. Nevertheless, the uniform configuration of the LiDARs still poses a challenge. In contrast, LIBRE [15] provides a driving dataset along with a separate distance error report for 12 LiDARs, detailing performance under diverse weather conditions. It is essential to note that each sequence in LIBRE features only a single LiDAR. Moreover, the dataset does not provide insights into LiDAR odometry on platforms with aggressive motions since it only involves stationary LiDAR-equipped vehicles in artificially controlled weather conditions.

The previously mentioned LiDAR datasets, collected with mapping-car systems, exhibit limited roll and pitch angle variation. To address this, the NTU VIRAL dataset [99] introduces a new challenge by deploying LiDAR on an unmanned aerial vehicle (UAV). Similarly, the Hilti-Oxford dataset [165], ConSLAM dataset [135], and Wild Places dataset [70] present this challenge using handheld systems in construction sites and forests. The Hilti-Oxford dataset further includes indoor environments, while Wild Places ventures into forest terrain, adding complexity to the dataset landscape. Additionally, the Pohang Canal dataset [29] captures canal environments using a ship-based system.

Recent LiDAR datasets have introduced new dimensions to research by incorporating multiple heterogeneous LiDARs. For instance, the UrbanNav dataset [55] features three mechanical LiDARs navigating urban landscapes, presenting challenges due to asynchronous multiple LiDARs. The Tiers dataset [112] employs a combination of three mechanical and three scanning solid-state LiDARs, capturing distinct measurements from identical locations and offering a unique perspective. On a larger scale, the HeLiPR dataset [65] includes a variety of structured environments and introduces FMCW LiDAR, providing the opportunity to utilize velocity information for LiDAR odometry.

Various LiDAR odometry datasets with unique strengths and limitations have been released and continue to emerge. This highlights the importance of recognizing that no single dataset can offer universal comprehensiveness. Thus, the thoughtful selection of datasets aligned with their specific characteristics remains essential for the development of robust and adaptable LiDAR odometry solutions.

8.2 Evaluation

Evaluation serves as a cornerstone in advancing LiDAR odometry. Comprehensive and consistent evaluation methods are essential, as they enable the measurement of progress, identification of weaknesses, and guidance for future research. Assessing LiDAR odometry algorithms is crucial for establishing their dependability and accuracy and fostering comparability across various approaches. Ultimately, this promotes continuous improvement in LiDAR odometry.

Fig. 4 Various evaluation methods of LiDAR odometry. a Trajectory error shows local and global discrepancies along the estimated path. b Start-to-end error highlights long-term drift from start to finish. c GCP-based error assesses alignment with ground control points (GCPs) for real-world accuracy. d Entropy-based error reflects scan registration quality and overall system reliability

8.2.1 Ground truth generation

The cornerstone of the evaluation process is the ground truth, which serves as the reference for assessing the precision and reliability of odometry estimation. Various methods can be employed to obtain ground truth for reliable evaluation of LiDAR odometry, each with unique strengths and potential limitations.

One approach utilizes GPS, which provides precise global position measurements. When combined with Real-Time Kinematic (RTK) corrections, GPS can attain centimeter-level precision. Integrating GPS with an IMU enables the derivation of a complete 6-DOF pose, encompassing both position and orientation. Additionally, an Inertial Navigation System (INS) further improves continuous pose estimation, particularly in environments with weak or lost GPS signals.

Another method involves leveraging SLAM technology. The trajectory generated by SLAM, utilizing sensors such as LiDAR, camera, and encoder, can serve as an additional reference for ground truth, especially in environments where GPS signals are unavailable. Combining the strengths of both GPS and SLAM can create a robust system that offers high accuracy and resilience to environmental challenges.

A third approach entails employing tracking systems. These specialized systems, typically optical, utilize multiple cameras [112] or sensors [99] to meticulously track markers or objects within a designated area. They prove especially valuable in environments with low SLAM accuracy or where GPS signals are unavailable. Due to their firmly established precision in both temporal and spatial dimensions, tracking systems become a reliable reference for ground truth in controlled setups.

The Ground Control Points (GCP) method constitutes the fourth approach. This method utilizes specific ground points with known and precise geographical locations, often established using total stations. These GCPs are frequently employed to guarantee accurate positioning and alignment. By comparing sensor data with these reference points, any discrepancies can be identified and corrected, ensuring high measurement accuracy.

Finally, Terrestrial Laser Scanning (TLS) is utilized to establish ground truth. As a variant of LiDAR, TLS swiftly scans and captures 3D data of the environment. Due to its extensive reach and high-resolution data, TLS-based ground truth serves as a benchmark for aligning individual scans. The alignment of these scans to the TLS-derived ground truth enables the determination of the robot’s 6-DOF state, which then serves as the definitive reference for LiDAR odometry.

8.2.2 Evaluation methods

In the evaluation of LiDAR odometry, several quantitative metrics are pivotal for assessing the accuracy and effectiveness of algorithms. When compared to reliable ground truth, these metrics offer insights into the precision, stability, and areas for potential improvement of a particular odometry system. This section will explore some essential evaluation methods (Fig. 4).

Initially, we consider the Absolute Trajectory Error (ATE). ATE provides a comprehensive perspective on the overall odometry consistency. It computes the average deviation between corresponding poses in the estimated trajectory relative to the ground truth, thereby capturing discrepancies throughout the trajectory. Mathematically, it is expressed as:

$$\begin{aligned} \textrm{ATE} = \sqrt{\frac{1}{N} \sum _{i=1}^{N} || p_{i, \text {est}} - p_{i, \text {gt}} ||^2} \end{aligned}$$
(1)

where \( p_{i, \text {est}} \) represents the estimated pose, \( p_{i, \text {gt}} \) the ground truth pose, and \( N \) the total number of poses.
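
As a reference for Eq. (1), a minimal NumPy sketch, assuming the two trajectories are time-synchronized and already expressed in a common frame (e.g., after Umeyama alignment):

```python
import numpy as np

def ate(p_est, p_gt):
    """Absolute Trajectory Error of Eq. (1).

    p_est, p_gt: (N, 3) matched position sequences in a common frame.
    """
    return np.sqrt(np.mean(np.sum((p_est - p_gt) ** 2, axis=1)))
```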

Next, our focus shifts to the Relative Trajectory Error (RTE). Unlike the broad scope of ATE, RTE concentrates on shorter segments of the trajectory. It evaluates the local consistency and accuracy of the odometry, which is particularly crucial for applications that require precision over shorter distances. The formulation of RTE can be represented as:

$$\begin{aligned} \textrm{RTE} = \sqrt{\frac{1}{M} \sum _{j=1}^{M} || q_{j, \text {est}} - q_{j, \text {gt}} ||^2} \end{aligned}$$
(2)

where \( q_{j, \text {est}} \) and \( q_{j, \text {gt}} \), respectively, denote the estimated and ground truth relative poses over a defined segment, with \( M \) being the number of such segments. ATE and RTE are typically computed with the RPG trajectory evaluation toolbox [168] or the EVO evaluator [44].
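
In the same spirit as the ATE sketch above, the translational part of Eq. (2) can be sketched by differencing positions over fixed-length segments; a full implementation would compose SE(3) poses and report rotational error as well. The segment length `stride` is an illustrative parameter:

```python
import numpy as np

def rte(p_est, p_gt, stride=10):
    """Relative Trajectory Error of Eq. (2), translation part only.

    Relative motions are taken over segments of `stride` poses.
    """
    d_est = p_est[stride:] - p_est[:-stride]
    d_gt = p_gt[stride:] - p_gt[:-stride]
    return np.sqrt(np.mean(np.sum((d_est - d_gt) ** 2, axis=1)))
```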

The Start-to-End Error proves particularly insightful in assessing the long-term consistency and reliability of the odometry result. This metric evaluates the misalignment between the initial and final points of trajectories, offering a macroscopic perspective on odometry performance. Notably, as precisely locating the exact start and end points can be challenging, the error is determined by computing the relative translation between these points using registration methods such as ICP and Generalized-ICP. It is formulated as:

$$\begin{aligned} \text {Error} = \left| \Delta \textbf{p}_{\text {est}} - \Delta \textbf{p}_{\text {gt}} \right| \end{aligned}$$
(3)

where \(\Delta \textbf{p}_{\text {est}}\) and \(\Delta \textbf{p}_{\text {gt}}\) are the position differences obtained from the odometry and from the registration method, respectively. This metric is particularly effective when a reliable ground truth trajectory is hard to obtain, such as in indoor environments where GPS measurements are unavailable or in unstructured terrains where SLAM accuracy is compromised.
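
A minimal sketch of Eq. (3) using Open3D's ICP to recover the reference start-to-end translation; the variable names and the 1.0 m correspondence distance are illustrative assumptions, and a rough initial guess `init` is usually needed for ICP to converge:

```python
import numpy as np
import open3d as o3d

def start_to_end_error(scan_start, scan_end, delta_p_est, init=np.eye(4)):
    """Start-to-End Error of Eq. (3), translation part.

    scan_start, scan_end: Open3D point clouds from the first and last
    poses; delta_p_est: (3,) odometry start-to-end translation in the
    start frame. ICP supplies the reference transform.
    """
    reg = o3d.pipelines.registration.registration_icp(
        scan_end, scan_start, max_correspondence_distance=1.0,
        init=init,
        estimation_method=o3d.pipelines.registration
        .TransformationEstimationPointToPoint())
    delta_p_ref = reg.transformation[:3, 3]  # end pose in start frame
    return np.linalg.norm(delta_p_est - delta_p_ref)
```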

Another approach utilizes GCPs, predetermined precise ground locations typically established with total stations. To conduct an evaluation using GCPs, the estimated trajectory undergoes alignment with these control points using SE(3) Umeyama alignment [136]. Following alignment, the absolute distance error for each GCP is calculated to gauge its deviation from the predicted trajectory. This method hinges on the precision of GCPs to assess the accuracy of the odometry system.
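
The SE(3) Umeyama alignment used in this procedure has a well-known closed form. A minimal sketch without scale estimation, where `src` holds the estimated positions at the GCP observation times and `dst` the surveyed GCP coordinates:

```python
import numpy as np

def umeyama_se3(src, dst):
    """Closed-form rigid alignment of src onto dst (Umeyama, no scale).

    src, dst: (N, 3) corresponding points.
    Returns R (3, 3) and t (3,) such that dst ~= src @ R.T + t.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    cov = (dst - mu_d).T @ (src - mu_s) / len(src)
    U, _, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                 # enforce a proper rotation
    R = U @ S @ Vt
    t = mu_d - R @ mu_s
    return R, t

# After alignment, the per-GCP deviation is simply:
# errors = np.linalg.norm(src @ R.T + t - dst, axis=1)
```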

Lastly, certain methods assess the registration quality between consecutive scans. Given that the trajectory derived from LiDAR odometry depends on successful registration, evaluating this aspect can indirectly provide insights into odometry accuracy. The concept of entropy [1] serves as a useful tool for such evaluations. When two point clouds are accurately registered, the merged point cloud retains entropy similar to that of the original individual clouds, whereas poor registration raises the entropy of the combined cloud. Because appropriately registered point clouds maintain consistent entropy, or uncertainty, in their combined form, entropy is a valuable metric for evaluating registration quality.
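
One common instantiation of this idea is a mean map entropy: fit a Gaussian to each point's local neighborhood and average the resulting differential entropies. The sketch below is one such formulation under assumed parameters (search radius, minimum neighbor count), not the exact metric of [1]:

```python
import numpy as np
from scipy.spatial import cKDTree

def mean_map_entropy(points, radius=0.5, min_neighbors=5):
    """Average differential entropy of local Gaussians in a cloud.

    Misaligned scans thicken local surfaces, inflating the covariance
    determinants and hence the mean entropy of the merged cloud.
    """
    tree = cKDTree(points)
    entropies = []
    for idx in tree.query_ball_point(points, r=radius):
        if len(idx) < min_neighbors:   # too few points for a stable fit
            continue
        cov = np.cov(points[idx].T)
        det = np.linalg.det(2.0 * np.pi * np.e * cov)
        if det > 0:
            # Differential entropy of a 3-D Gaussian: 0.5 * ln|2*pi*e*cov|
            entropies.append(0.5 * np.log(det))
    return float(np.mean(entropies))
```

Comparing this value for the merged cloud against the individual scans gives a ground-truth-free indicator of registration quality.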

Each evaluation method for LiDAR odometry offers distinct insights. Researchers must choose the most suitable validation approach based on their specific experimental context. Continuous advancements and the introduction of innovative comparison methodologies have the potential to enhance the comprehensive evaluation of robustness and accuracy over time.

Table 4 Benchmark results (top: LiDAR-only odometry, bottom: LiDAR-inertial odometry)

Fig. 5 Estimated trajectories. a and b illustrate results from the HeLiPR and ConSLAM datasets, respectively. In (b), the red box zooms in on instances where LiDAR odometry deviates farther from the ground truth than LiDAR-inertial odometry

8.3 Benchmark results

Based on the aforementioned datasets and evaluation methods, we conduct benchmarks comparing the performance of LiDAR-only odometry and LiDAR-inertial odometry. Our benchmark analyzes five LiDAR-only and five LiDAR-inertial odometry methods. For LiDAR-only odometry, the selected methods are LOAM [162], LeGO-LOAM [125], KISS-ICP [137], CT-ICP [31], and DLO [20]. For LiDAR-inertial odometry, we focus on LIO-SAM [126], FAST-LIO2 [152], VoxelMap [160], DLIO [21], and Point-LIO [47].

As shown in Table 4, the evaluation of LiDAR-only and LiDAR-inertial odometry is performed on sequence02 from the ConSLAM dataset [135], eee03 from the NTU VIRAL dataset [99], and Roundabout02 from the HeLiPR dataset [65]. We select these three datasets for their distinct characteristics: ConSLAM captures brief sequences from a construction site with a handheld system, NTU VIRAL acquires short sequences from a campus via a drone, and HeLiPR uses a car for large-scale, city-level data acquisition. Both the platforms and the environments used for data acquisition thus vary across these datasets.

We assess method performance by measuring the ATE in meters. The NTU VIRAL dataset employs its dedicated evaluation tool for measurements, while for other datasets, we use EVO [44], a widely recognized tool in the field.

As evidenced in Table 4, LiDAR-inertial odometry generally demonstrates enhanced robustness compared to LiDAR-only odometry. However, not all LiDAR-inertial systems outperform LiDAR-only systems, particularly on the HeLiPR dataset. Its exceptionally long sequence is susceptible to cumulative errors, as indicated in Fig. 5a. In such cases, integrating an IMU with LiDAR may not significantly outperform LiDAR-only odometry, since errors continue to accumulate once a large drift has occurred. This highlights the necessity of integrating error-correcting mechanisms such as GPS or loop closure in prolonged robot operations to improve odometry performance.

On the other hand, fusion with an IMU can enhance accuracy for shorter paths. The notable advantage of LiDAR-inertial odometry lies in its effective handling of aggressive motions, especially sudden rotations. This becomes particularly evident in scenarios with dynamic motion, such as those involving handheld systems or drones in datasets like ConSLAM or NTU VIRAL. In ConSLAM, although some LiDAR-only and LiDAR-inertial methods may exhibit similar paths, a closer examination reveals that LiDAR-only odometry lacks precision in detailed path estimation. It deviates more significantly from the ground truth compared to LiDAR-inertial odometry, as depicted in Fig. 5b.

In summary, while LiDAR-inertial odometry generally surpasses LiDAR-only systems in robustness, it does not yield accurate estimates in all scenarios, especially in long sequences prone to cumulative errors. For shorter, dynamic paths, by contrast, fusion with an IMU offers clear advantages in accuracy and in handling aggressive motions. This underscores the importance of context-specific system selection and the integration of corrective mechanisms for optimal odometry performance.

9 Conclusion

This paper emphasizes the crucial role of LiDAR odometry in robotics, underlining its profound influence on perception and navigation. Our survey covers almost all recent LiDAR odometry advancements, delineating their strengths and weaknesses. The versatility of LiDAR odometry is evident, especially in environments with unreliable GPS, making it essential for robotic navigation and mapping. Furthermore, this paper addresses remaining challenges in LiDAR odometry, discusses potential improvements and future directions in the field, and introduces a variety of datasets and evaluation metrics.

While a wealth of LiDAR odometry literature is available, unfortunately, there is no one-size-fits-all solution. LiDAR odometry involves a trade-off between resources and performance, requiring users to weigh these factors against their specific application requirements and available resources. For low computational budgets, especially on low-power single-board computers, a LiDAR-only approach may be optimal in well-defined environments. Integrating an IMU in a loosely coupled fashion can enhance results without significantly increasing computational demands. A tightly coupled multi-sensor approach is advisable for applications demanding high accuracy across various environments, and combining LiDAR with an IMU is a balanced choice in general situations. Multiple LiDARs may be beneficial for addressing the narrow-FOV issue, while incorporating a camera can be advantageous in texture-limited scenarios. Those with greater computational resources can explore the advanced capabilities offered by deep learning-based LiDAR odometry.

We anticipate the ongoing expansion of LiDAR odometry and believe that resolving the challenges through deep learning and multi-modal sensor fusion will pave the way for a general solution. Furthermore, we expect that the continuous development of both LiDAR sensors and odometry algorithms will lead to the emergence of even more accurate and robust odometry solutions in the future.