UAV low-altitude obstacle detection based on the fusion of LiDAR and camera

In this paper, targeting small unmanned aerial vehicles (UAVs) flying in low-altitude suburban environments, we choose a sensor configuration of LiDAR and visible light camera and design static and dynamic obstacle detection algorithms based on sensor fusion. For static obstacles such as power lines and buildings in the low-altitude environment, image-assisted verification of point clouds is used to fuse the contour information of the images with the depth information of the point clouds to obtain the location and size of the obstacles. For unknown dynamic obstacles such as rotary-wing UAVs, an IMM-UKF algorithm is designed to fuse the range measurements of the point clouds with the high-precision angle measurements of the images to accurately estimate the location and velocity of the obstacles. We build an experimental platform to verify the effectiveness of the obstacle detection algorithms in actual scenes and evaluate the relevant performance indexes.


Introduction
In recent years, UAVs have been used in various military and civilian fields, including reconnaissance, tracking and positioning, power inspection, and agricultural plant protection. The wide application of UAVs places greater demands on airspace resources. Many UAV missions take place in low-altitude flight scenarios, where the vehicle must sense and avoid (SAA) various obstacles in the flying environment, including fixed obstacles such as buildings, trees, and power lines, and moving obstacles such as birds and other aircraft. Flight safety is the prerequisite for completing the mission, and the development of a UAV SAA system is indispensable for UAVs to move toward large-scale applications [1].
When UAVs fly in the low-altitude suburban environment, the environment is relatively complex. As the flight altitude decreases, the density of obstacles gradually increases, and small obstacles such as power lines make target detection more difficult. The UAV's SAA system needs not only to determine the presence or absence of obstacles, but also to obtain their distance and scale in airspace, so as to generate a safe avoidance path that does not itself require a secondary avoidance maneuver. Determining the accurate distribution of power lines and moving obstacles in airspace is difficult but important for small UAVs. Due to their limited load capacity, small UAVs can only carry low-cost, low-quality, low-power sensors, such as visible light cameras [2,3], infrared cameras [4], depth cameras [5], millimeter wave radars [6], and LiDAR [7]. Visual sensors such as visible light and infrared cameras have strong image acquisition ability and the advantages of small size, light weight, and low power consumption.
However, these sensors cannot directly obtain the depth of the target, and vision-based depth estimation often brings the additional cost of complex maneuvering. Binocular vision [8] can obtain the distance of obstacles through stereo matching, but the image feature matching process is computationally demanding and has poor real-time performance. Depth cameras do not require complicated calculations and obtain image depth directly through the hardware design of the sensor, but their depth detection range is short and does not exceed 20 m. Millimeter wave radars can obtain the target's distance, azimuth, and velocity, and penetrate fog, smoke, and dust well. They offer high ranging and velocity measurement accuracy and good environmental adaptability, but their angle measurement accuracy is low. LiDAR has unique advantages in detecting obstacles due to its precise distance and direction measurement. Traditional three-dimensional LiDAR is less widely used because of its heavy weight and sparse longitudinal point cloud density. With the rapid development of LiDAR technology, a new type of 3D LiDAR using non-repetitive scanning makes up for the sparse longitudinal point cloud density, making it possible to apply LiDAR to small UAVs. However, single-sensor configurations have limitations in detection range, detection accuracy, and anti-interference capability, and have difficulty detecting obstacles stably and accurately in the complex flight environment of future airspace integration. By carrying multiple types of sensors, UAV platforms can obtain multi-source information [9,10]. LiDAR can directly obtain the spatial position and partial contour information of obstacles.
The fusion of LiDAR with a visible light camera, whose images contain rich contour information, has great application prospects for obstacle detection. LiDAR and visible light camera are heterogeneous sensors, and obstacle detection with them generally uses decision-based information fusion algorithms. These take two main forms: image-assisted verification of LiDAR point clouds, and cross-validation of image and point clouds. Image-assisted verification of point clouds first uses the point clouds to detect obstacles and projects the point cloud detection results into the image to determine the region of interest (ROI). Obstacle detection is then performed on the ROI, and the point cloud detection result is verified or supplemented according to the image detection result. In [11], the authors use the point clouds to determine the ROI in the image and use a cascade classifier to identify the target in the ROI, so that the image verifies the point clouds and detection efficiency improves. In [12], the authors use cross-validation of vision and LiDAR, extracting target features from the point clouds and the image respectively to train the classifier. Complementing information through sensor fusion is an effective way to improve the perception capability and robustness of a UAV SAA system. Based on current research and the obstacle avoidance requirements of small UAVs, our contributions in this paper are as follows.
1. We propose a fusion scheme of LiDAR and visible light camera to detect obstacles for small UAVs, and introduce a calibration method.
2. We use image-assisted verification of point clouds to fuse the contour and angle information of the image with the depth information of the point clouds. We design a fusion detection method for a variety of static obstacles and a state estimation method for dynamic obstacles using the LiDAR and visible light camera.
In the cross-validation method of vision and LiDAR, obstacle detection is performed in the point clouds and the image at the same time. The point cloud detection result is projected into the image and associated with the image detection result. The fusion of LiDAR and image information can obtain the contour information of obstacles and accurately obtain the spatial distribution of obstacles such as power lines and buildings in low-altitude environments, which provides accurate environmental sensing results for subsequent obstacle avoidance. Compared with ground scenes, using the fusion of LiDAR and visible light cameras to solve the obstacle detection problem of small UAVs faces many challenges, including the limited load and computing capabilities of UAV platforms, small obstacles such as power lines in the environment, and more maneuverable dynamic obstacles.

The obstacle fusion detection framework
In this section, we briefly introduce the obstacle fusion detection framework. The low-altitude obstacle detection framework for small UAVs based on the fusion of LiDAR and visible light camera is shown in Fig. 1. The framework first performs offline calibration of the LiDAR and visible light camera, then selects an appropriate point cloud segmentation algorithm, and finally designs fusion detection methods for linear static obstacles, surface static obstacles, and dynamic obstacles respectively. We select a three-dimensional non-repetitive scanning LiDAR in which six lasers scan simultaneously along a path that does not repeat, which avoids the shortcomings of traditional mechanically rotating LiDAR in the vertical direction. Data fusion of the two sensors first requires sensor calibration. The purpose of camera calibration is to obtain the internal parameter matrix and distortion coefficients of the visible light camera, which are used to correct the distortion of the original image. The joint calibration of LiDAR and camera first obtains the three-dimensional coordinates of the corner points of the calibration plate in the point clouds and their pixel coordinates in the image, and then fits the coordinate conversion matrix between the LiDAR coordinate system and the image pixel coordinate system. The relative position of the LiDAR and visible light camera and the calibration plate used are shown in Fig. 2.
The conversion relationship between the LiDAR and the pixel coordinate system of the distortion corrected image is defined as follows.
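Assuming the standard pinhole camera model, this relationship can be written in the following form. The symbols K (the internal parameter matrix obtained from camera calibration, with focal lengths f_x, f_y and optical center u_0, v_0), z_c (the depth of the point in the camera frame), and (x_l, y_l, z_l) (the point in the LiDAR frame) are notation assumed here for illustration; R_l and t_l denote the LiDAR-to-camera rotation and translation.

```latex
z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \left( R_l \begin{bmatrix} x_l \\ y_l \\ z_l \end{bmatrix} + t_l \right),
\qquad
K = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}
```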
Here, R_l and t_l are the rotation matrix and translation vector from the LiDAR coordinate system to the camera coordinate system, respectively. The calibration results are shown in Fig. 2, where the point cloud is colored according to distance. The calibration error grows gradually with distance, and the joint calibration error stays within 10 pixels for distances up to 100 m.
Point clouds are a series of unordered discrete points in space, which are usually incomplete and sparsely distributed. Point cloud segmentation assigns an obstacle label to each discrete point and groups the points of the same obstacle into one class, so as to obtain the spatial distribution of the obstacles contained in the point clouds. Non-repetitive LiDAR scanning lets the same obstacle be scanned repeatedly at a different position on each scan, so the generated point clouds are more disordered than those of traditional LiDAR. When small UAVs fly in the low-altitude environment, the point cloud data is also sparser than in the ground environment. In view of these characteristics, we first use a KD tree [13] to store the original unordered point clouds, and then obtain neighboring points through range search on the KD tree. Point cloud segmentation is performed by growing clusters from neighboring points based on an adaptive Euclidean distance.
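The segmentation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a brute-force range search stands in for the KD-tree query, and the threshold form `base + gain * range`, the function names `adaptive_threshold`, `range_search`, and `segment`, and the parameter values are all hypothetical.

```python
import math
from collections import deque


def adaptive_threshold(point, base=0.3, gain=0.02):
    """Distance threshold that grows linearly with range from the sensor.

    The linear form and the default values are assumptions for illustration.
    """
    rng = math.sqrt(point[0] ** 2 + point[1] ** 2 + point[2] ** 2)
    return base + gain * rng


def range_search(points, idx, radius):
    """Brute-force stand-in for a KD-tree range query around points[idx]."""
    return [j for j, q in enumerate(points)
            if j != idx and math.dist(points[idx], q) <= radius]


def segment(points, base=0.3, gain=0.02):
    """Label each point with a cluster id by growing regions over neighbors."""
    labels = [-1] * len(points)  # -1 means "not yet assigned"
    cluster_id = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            radius = adaptive_threshold(points[i], base, gain)
            for j in range_search(points, i, radius):
                if labels[j] == -1:
                    labels[j] = cluster_id
                    queue.append(j)
        cluster_id += 1
    return labels


if __name__ == "__main__":
    cloud = [(10.0, 0.0, 0.0), (10.2, 0.0, 0.0), (10.4, 0.1, 0.0),
             (50.0, 5.0, 0.0), (50.9, 5.0, 0.0)]
    print(segment(cloud))
```

With `gain=0` the threshold is fixed, and the two distant, sparsely sampled points fall into separate clusters; letting the threshold grow with range keeps them together, which is the motivation for an adaptive distance.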

Obstacles detection method
Obstacles in the low-altitude environment for small UAVs can be divided into static obstacles and moving obstacles. Static obstacles can be further divided into linear obstacles and surface obstacles according to their scanning area. In this section, we introduce the detection methods for each type of obstacle.

Linear static obstacle detection
Linear static obstacles are mainly represented by power lines, which are difficult to detect due to their small scanning area and size. The sparseness of LiDAR point clouds leaves some parts of the same power line without any points, which leads to over-segmentation that divides the same power line into different obstacles. Moreover, the power lines in some areas are missed entirely. It is therefore difficult to faithfully describe the distribution of power lines in the environment, which brings additional risk to the UAV's evasive maneuvers. The image contains the complete outline of the power line, so fusing it with the point clouds can make up for the shortcomings of detecting power lines with LiDAR alone.
Point clouds segmentation divides the original point clouds into different point cloud clusters, which can

Surface static obstacle detection
For surface static obstacles such as buildings and mountains, the scanning cross-sectional area is larger, so detection is less difficult than for small obstacles such as power lines. However, UAVs need a longer response time and a longer maneuver distance to avoid buildings. Obtaining accurate building distance and size information helps UAVs plan collision avoidance paths reasonably and reduces the risk during collision avoidance. When the building is far away, the sparse point cloud is over-segmented to varying degrees along the irregular protrusions or depressions of the building, and some areas may be missed. As the detection distance increases, the accuracy of the building profile information decreases.
The image contains rich outline information. Similar to the detection of linear obstacles such as power lines, fusing image contour information with point cloud information can make up for the shortcomings of using point clouds alone for obstacle detection. In the image, the edge contour map of the obstacle can be obtained through edge detection. We use the fast edge detection algorithm based on structured forests [15] to obtain image contours. The algorithm learns effective edge detectors from the structure present in local image patches, and uses random decision forests to solve the structured learning problem of predicting a local edge mask. Using the BSDS500 (https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/resources.html) image segmentation data set, the structured forest classifier is trained offline to obtain a general edge detection classifier, which is then used to perform online edge detection on the image. The structured forest processes different pixel positions in the image and different decision trees in parallel, so it is computationally efficient and runs fast.
The point clouds are first projected onto the edge contour map to determine the region of interest, and reliable contour lines are then searched outward from the boundary points of the point cloud projection, which determines the contour of the obstacle corresponding to the point cloud data. The fused detection result is closer to the real contour than the point cloud data alone. For different classes of point cloud data, if two classes partially overlap in the obstacle area in the image, they may belong to the same obstacle. In that case, the adaptive distance threshold of point cloud segmentation is appropriately increased to eliminate the influence of obstacle structure and point cloud sparsity. The two classes are re-segmented with the new threshold to determine whether they belong to the same class, and point clouds belonging to the same class are merged to eliminate the impact of over-segmentation.
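The merge test for over-segmented clusters can be illustrated with a short sketch. It is a simplification under stated assumptions: each cluster's obstacle area in the image is represented by a hypothetical rectangle `(u_min, v_min, u_max, v_max)` that a contour search is assumed to have produced, and re-segmentation with a raised threshold is reduced to checking whether the closest pair of points between the two clusters falls within the relaxed threshold. The names `regions_overlap`, `min_gap`, `should_merge` and the factor `relax` are illustrative.

```python
import math


def regions_overlap(a, b):
    """True when two image rectangles (u_min, v_min, u_max, v_max) share area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def min_gap(cluster_a, cluster_b):
    """Smallest 3D Euclidean distance between points of the two clusters."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def should_merge(region_a, region_b, cluster_a, cluster_b, threshold, relax=1.5):
    """Decide whether two point cloud clusters belong to the same obstacle.

    Step 1: their image obstacle areas must partially overlap.
    Step 2: with the distance threshold raised by `relax` (assumed factor),
            region growing would join them, i.e. their closest points are
            within the relaxed threshold.
    """
    if not regions_overlap(region_a, region_b):
        return False
    return min_gap(cluster_a, cluster_b) <= relax * threshold


if __name__ == "__main__":
    wall_left = [(0.0, 0.0, 10.0), (0.5, 0.0, 10.0)]
    wall_right = [(1.5, 0.0, 10.0), (2.0, 0.0, 10.0)]
    print(should_merge((100, 100, 200, 200), (180, 120, 300, 220),
                       wall_left, wall_right, threshold=0.8))
```

Gating the relaxed threshold on image overlap keeps the larger distance from merging clusters that merely happen to be close in space but belong to visually separate obstacles.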

Dynamic obstacle detection
Dynamic obstacles, represented by aerial vehicles, introduce motion distortion into the LiDAR point cloud data, which reduces the accuracy of point cloud detection. The instantaneous imaging of the visible light camera and the high-precision angle measurements its images contain help to increase the accuracy of the motion state estimate of the aerial vehicle. In this section, we use the interacting multiple model unscented Kalman filter (IMM-UKF) algorithm to fuse the range information of the point clouds with the high-precision angle measurements of the image to estimate the state of the aerial vehicle. As shown in Fig. 3, the target state and covariance matrix are initialized with LiDAR data; the next state is then predicted, and image or LiDAR measurements are read in time order. If the next measurement is point cloud data, the LiDAR filter computes and updates according to the LiDAR measurement equation. If the new measurement is image data, the range from the LiDAR filter's prediction at the image timestamp is combined with the angle information of the image to form the measurement, which is input into the fusion filter for calculation and update. The image data updates only the angle information.
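The measurement routing described above can be sketched as follows. This is only an illustration of the scheduling logic, not of the IMM-UKF itself: the filter names are returned as plain strings, the function names `predicted_range`, `route_measurement`, and `in_time_order` are hypothetical, and the small offset between the LiDAR and camera frames is ignored when the predicted range is paired with the image angles.

```python
import math


def predicted_range(position):
    """Range of a predicted Cartesian position (x, y, z) from the sensor origin."""
    x, y, z = position
    return math.sqrt(x * x + y * y + z * z)


def route_measurement(kind, data, predicted_position=None):
    """Return (filter_name, measurement) for one incoming frame.

    kind == "lidar": data is (range, azimuth, elevation) and goes to the
                     LiDAR filter unchanged.
    kind == "image": data is (azimuth, elevation); the range is taken from
                     the LiDAR filter's prediction at the image timestamp,
                     so the image contributes only the angles.
    """
    if kind == "lidar":
        return "lidar_filter", tuple(data)
    if kind == "image":
        if predicted_position is None:
            raise ValueError("an image frame needs the predicted position")
        azimuth, elevation = data
        return "fusion_filter", (predicted_range(predicted_position),
                                 azimuth, elevation)
    raise ValueError(f"unknown measurement kind: {kind}")


def in_time_order(frames):
    """Sort (timestamp, kind, data) frames so they are consumed in time order."""
    return sorted(frames, key=lambda frame: frame[0])


if __name__ == "__main__":
    frames = [(0.10, "image", (0.10, -0.05)), (0.00, "lidar", (5.0, 0.1, -0.05))]
    for stamp, kind, data in in_time_order(frames):
        print(stamp, route_measurement(kind, data, predicted_position=(3.0, 0.0, 4.0)))
```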
In the interacting multiple model algorithm, three models, CV, CA and CSCT, are used to describe the movement of the UAV in space. The raw LiDAR data are the distance, azimuth, and elevation angle in the LiDAR polar coordinate system. In the measurement equation for LiDAR, [R, t] is the rotation and translation from the camera coordinate system to the LiDAR coordinate system. The visible light image contains high-precision angle measurement information. From the pixel position of the target centroid in the image and the internal parameter matrix of the camera, the azimuth θ and elevation angle φ of the target in the camera coordinate system can be calculated according to Eq. 3.
Here (u, v) are the pixel coordinates of the centroid, (u_0, v_0) are the optical center coordinates, and f is the focal length. The camera sensor has its own measurement equation. The observation functions h_i(k) of the two sensors are both nonlinear, so the filter module in the interacting multiple model framework adopts the unscented Kalman filter algorithm for state filtering.
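The conversion from a centroid pixel to azimuth and elevation can be sketched under a pinhole model. The exact form and sign convention of the paper's equation are not reproduced here; the sketch assumes that u increases to the right, v increases downward, azimuth is positive to the right of the optical axis, and elevation is positive upward. The function name `pixel_to_angles` is illustrative.

```python
import math


def pixel_to_angles(u, v, u0, v0, f):
    """Azimuth and elevation (rad) of a pixel under a pinhole model.

    u, v   : pixel coordinates of the target centroid
    u0, v0 : optical center in pixels
    f      : focal length in pixels
    Sign conventions are assumptions (see the accompanying text).
    """
    dx = u - u0
    dy = v - v0
    azimuth = math.atan2(dx, f)
    # Elevation is measured from the horizontal plane through the optical axis;
    # the minus sign converts "v grows downward" into "elevation grows upward".
    elevation = math.atan2(-dy, math.hypot(dx, f))
    return azimuth, elevation


if __name__ == "__main__":
    print(pixel_to_angles(u=740.0, v=310.0, u0=640.0, v0=360.0, f=1000.0))
```

A target at the optical center maps to zero azimuth and zero elevation, and a one-pixel error near the center corresponds to roughly 1/f radians, which is why the image angles are much finer than the angular resolution of sparse LiDAR returns.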

Experimental results
In this section, to verify the effectiveness of the obstacle detection algorithms based on the fusion of LiDAR and visible light camera, we use the LiDAR and visible light camera on the UAV platform to collect obstacle data in low-altitude suburban environments from power line avoidance scenes, building avoidance scenes, and encounter scenes with small low-altitude UAVs. The experimental platform is shown in Fig. 4.
The purpose of the power line avoidance scene is to verify the perception and positioning capabilities of the sensing system for low-altitude linear static obstacles. Compared with detection from LiDAR data alone, power line detection from fused image and point cloud data avoids over-segmentation of power lines and missed detection in some areas. The comparison of the two detection results is shown in Fig. 5. In the first scene, discontinuities in the point clouds lead to over-segmentation of the power line when only point cloud data is used, and fusion detection corrects the over-segmentation based on line detection of the power lines. In the second scene, few points fall on the power line during the sampling period because it is far away, so relying on point cloud data alone would miss most of the power lines in the field of view. From the set of power line segments obtained by image detection, the distribution of the power line between its starting and ending azimuths can be estimated, which is closer to the actual location of the power line than the result from point cloud data alone. Two indexes computed over the power line pixel length in the field of view are used to evaluate power line detection: the effective detection ratio, named length ratio (LR), and the over-segmentation rate (OSR).
In the definitions of LR and OSR, L_d(i) and L_t(i) are the detected length and true length of the i-th power line respectively, and N_d and N_t are the detected number and true number of power lines respectively. Using 250 registered frames of power line point cloud and image data collected at different distances and angles in the actual scene, the number and length of power lines in each image are labeled manually for statistical comparison with the experimental results. The average value of each quantitative index for power line detection with LiDAR alone and with the fusion of the two sensors is shown in Table 1. Fusion-based detection effectively reduces over-segmentation of the power line, and its average effective detection length is 1.94 times that of LiDAR data alone, which verifies the effectiveness of the fusion algorithm. In tests in the actual environment, with the currently selected LiDAR and visible light camera parameters, the farthest distance at which the sensing subsystem can reliably obtain the distance and direction of the power line is 24 m.
The main purpose of the building avoidance scene is to verify the perception and positioning capabilities of the sensing system for surface static obstacles at low altitude. In the three-dimensional camera coordinate system, a cube model is used to describe the obstacle, and its projection in the corresponding two-dimensional image is a rectangle. The building detection results at different distances and in different situations are shown in Fig. 6. The left side shows the point cloud detection results, and the right side shows the corresponding fusion detection results. The figure covers different distances and tilt angles during the movement of the UAV. Fusion-based obstacle detection effectively reduces over-segmentation and describes obstacle size more accurately.
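Because each obstacle projects to a rectangle in the image, a detected rectangle can be compared with a manually labeled reference rectangle using intersection over union (IoU). The following is a minimal sketch of that standard computation; the box layout `(u_min, v_min, u_max, v_max)` and the function name `iou` are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two rectangles (u_min, v_min, u_max, v_max)."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the rectangles do not overlap
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


if __name__ == "__main__":
    detected = (0, 0, 10, 10)
    labeled = (5, 0, 15, 10)
    print(iou(detected, labeled))
```

An IoU of 1 means the detected rectangle coincides with the reference, while an over-segmented building typically yields several small rectangles, each with a low IoU against the full reference.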
The over-segmentation rate and the Intersection over Union (IoU), which refers to the overlap rate between the detected obstacle region and the true obstacle region, are given in Table 2. It can be seen that fusion-based detection effectively corrects the over-segmentation of the point clouds and improves the IoU value. Because of the large scanning cross-section of the building, the point cloud data can detect the building at a longer range. The detection distance for buildings based on the fusion of LiDAR and visible light image can reach 200 m.
The two-aircraft encounter scene, shown in Fig. 7, is designed to verify the state estimation capability of the LiDAR and visible light camera fusion configuration for low-altitude unknown dynamic obstacles. A real-time kinematic (RTK) module is installed on both the own UAV and the intruding UAV to obtain their relative position. We use the higher-accuracy RTK positioning data as the reference value of the relative position to evaluate the accuracy of the state estimation. The YOLOv4 target detection algorithm is used to obtain the position of the intruder in the image. Figure 8 shows the motion of the intruder when, after flying horizontally, it moves in the direction opposite to the own UAV. The own UAV is at the origin of coordinates with its nose pointing along the Z axis. In the opposite-direction movement, the relative speed is greater and the motion is more intense. The error curves for this situation are shown in Fig. 9. The average distance and speed errors on the three coordinate axes show that the state estimation results based on sensor fusion have higher accuracy. The high-precision angle measurement information of the image reduces the influence of motion distortion on the LiDAR data.
Table 3 Accuracy of state estimation for two configuration schemes (LiDAR, LiDAR+camera) in multiple motion modes
Based on the results of multiple flight experiments, the state estimation accuracy of the two configuration schemes under various motion forms is shown in Table 3. The typical evaluation indexes selected are the relative distance average error σ_r (m), relative azimuth average error σ_φ (rad), relative velocity magnitude error σ_v (m/s), and relative velocity horizontal direction average error σ_φ,v (rad). The comparison shows that the dynamic obstacle state estimation algorithm based on the fusion of LiDAR and visible light camera performs significantly better than LiDAR alone across a variety of motion forms, which verifies the effectiveness of the fusion algorithm. In actual flight tests, the maximum detection distance for a small rotary-wing UAV of this size is about 50 m.

Conclusions
In this paper, we design a fusion detection method for a variety of static obstacles in a low-altitude suburban environment and a state estimation method for unknown dynamic obstacles, based on a UAV SAA sensor configuration of LiDAR and visible light camera. Verification on data from actual scenes shows that these methods effectively improve the accuracy of the detected distance and size of static obstacles and the accuracy of the estimated state of dynamic obstacles. In future work, related evasion algorithms can be studied, and a complete SAA system based on the fusion of LiDAR and visible light camera can be constructed and tested in actual scenes.

Availability of data and materials
The data in this paper is not available.

Declarations
Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.