
1 Introduction

Multi-sensor data fusion is an emerging technology for collecting and processing information. With the development and application of unmanned system technology, intelligent equipment needs to perceive its surrounding environment through external sensors [1] in order to operate without human intervention. Lidar can measure the distance to a target and provide precise three-dimensional point cloud data, but it cannot capture much other environmental information;

a monocular camera can collect rich environmental information, but it cannot obtain accurate distance information. Given the complementary characteristics of the two sensors, fusing lidar and monocular camera data provides intelligent equipment with comprehensive environmental information and the feedback necessary for unmanned operation. To complete information fusion, the first step is joint calibration of the sensors [2], which determines their relative positions and the conversion relationship between their coordinate systems [3]. In this paper, a joint calibration method based on lidar point cloud data and two-dimensional grayscale image data is proposed, and a rectangular standard plate is used as the calibration target to verify the effectiveness of the method.

2 Monocular Camera Calibration

The purpose of monocular camera calibration is to establish a fast conversion between the monocular camera coordinate system and the world coordinate system, obtain their relative position relationship, and determine the internal and external parameters of the camera.

2.1 Pinhole Camera Model

Fig. 1. Linear camera model

As shown in Fig. 1, point O in space is the projection center of the pinhole camera. Projecting a point X in space onto plane s yields the projection point P, and OP represents the distance from point O to the point P on that plane.

The image plane of the camera is plane s. The optical center of the camera is point O, and the focal length of the camera is the distance OM, denoted by f. The optical axis of the camera is the ray emitted outward from the optical center, also known as the principal axis; it is perpendicular to plane s, and its intersection with the image plane s is called the principal point of the camera.

$$\uplambda \left[ {\begin{array}{*{20}c} {\overline{x}} \\ {\overline{y}} \\ 1 \\ \end{array} } \right] = K\left[ {\begin{array}{*{20}c} {x_{c} } \\ {y_{c} } \\ {z_{c} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {f_{u} } & s & {u_{0} } \\ 0 & {f_{v} } & {v_{0} } \\ 0 & 0 & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {x_{c} } \\ {y_{c} } \\ {z_{c} } \\ \end{array} } \right]$$
(1)

In Eq. (1), the matrix K is the internal parameter (intrinsic) matrix of the camera, which provides a fast transformation from the camera coordinate system to the image coordinate system. \((f_{u} ,f_{v} )\) are the focal length parameters of the camera; the focal length is the distance between the optical center and the image plane, and under the ideal pinhole camera model the two values are equal. \((u_{0} ,v_{0} )\) is the offset of the principal point from the origin of the image plane. When the u-axis of the image coordinate system is not exactly perpendicular to the v-axis, the resulting parameter s is called the skew (distortion) factor.
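
As a minimal sketch of Eq. (1), the following Python snippet projects a point from the camera coordinate system to pixel coordinates by multiplying with K and dividing by the depth; the intrinsic values and the test point are hypothetical, not the calibrated parameters of this paper.

```python
import numpy as np

# Hypothetical intrinsic parameters (f_u, f_v, u_0, v_0), zero skew
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# A point in the camera coordinate system (x_c, y_c, z_c), in metres
X_c = np.array([0.2, -0.1, 4.0])

# Eq. (1): lambda * [u, v, 1]^T = K * [x_c, y_c, z_c]^T, with lambda = z_c
p = K @ X_c
u, v = p[0] / p[2], p[1] / p[2]
print(u, v)  # pixel coordinates of the projected point
```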

2.2 Camera Calibration Principle

When ranging is carried out from a grayscale image, a geometric model must be established in order to relate a point on an object in three-dimensional space to its corresponding point in the camera image quickly and accurately, and the parameters of the camera constitute the basic parameters of this geometric model [4]. Through extensive calculation and practice, these parameters can be solved accurately. This process is called camera calibration.

2.3 Coordinate System Under Camera

There are four coordinate systems in the camera imaging model:

  1. a.

    World coordinate system: a coordinate system established with respect to an external reference point; its coordinate points are (XW, YW, ZW).

  2. b.

    Camera coordinate system: a coordinate system established with the optical center of the monocular camera as the reference point; its coordinate points are (x, y, z).

  3. c.

    Image coordinate system: the optical center is projected onto the imaging plane, and the resulting projection point is used as the origin of a rectangular coordinate system; its coordinate points are (X, Y).

  4. d.

    Pixel coordinate system: the coordinate system seen by the end user; its origin is in the upper left corner of the image, and its coordinate points are (u, v).

Various transformation relations from the world coordinate system to the pixel coordinate system are shown in Fig. 2:

Fig. 2. Conversion from world coordinate system to pixel coordinate system

The conversion relationship between coordinates is shown in Fig. 3:

Fig. 3. Schematic diagram of coordinate system relationship

  1. a)

    The transformation formula between the world coordinate system and the camera coordinate system is shown in Eq. 2:

    $$\left[\begin{array}{c}x\\ y\\ z\\ 1\end{array}\right]=\left[\begin{array}{cc}R& T\\ {0}^{T}& 1\end{array}\right] \cdot \left[\begin{array}{c}{X}_{w}\\ {Y}_{w}\\ {Z}_{w}\\ 1\end{array}\right]=\left[\begin{array}{cccc}{r}_{11}& {r}_{12}& {r}_{13}& {t}_{x}\\ {r}_{21}& {r}_{22}& {r}_{23}& {t}_{y}\\ {r}_{31}& {r}_{32}& {r}_{33}& {t}_{z}\\ 0& 0& 0& 1\end{array}\right]\cdot \left[\begin{array}{c}{X}_{w}\\ {Y}_{w}\\ {Z}_{w}\\ 1\end{array}\right]$$
    (2)

    where the matrix R is the rotation matrix and T is the translation vector. R satisfies the following conditions:

    $$\left\{\begin{array}{c}{r}_{11}^{2}+{r}_{12}^{2}+{r}_{13}^{2}=1\\ {r}_{21}^{2}+{r}_{22}^{2}+{r}_{23}^{2}=1\\ {r}_{31}^{2}+{r}_{32}^{2}+{r}_{33}^{2}=1\end{array}\right.$$
    (3)

    The matrix R contains three rotation variables \({\mathrm{R}}_{\mathrm{x}}\), \({\mathrm{R}}_{\mathrm{y}}\), \({\mathrm{R}}_{\mathrm{z}}\); together with the translation components \({\mathrm{t}}_{\mathrm{x}}\), \({\mathrm{t}}_{\mathrm{y}}\), \({\mathrm{t}}_{\mathrm{z}}\), these six quantities are called the external parameters of the camera.

  2. b)

    The transformation relationship between the image coordinate system and the camera coordinate system is as follows:

    $$\mathrm{z}\left[\begin{array}{c}X\\ Y\\ 1\end{array}\right]=\left[\begin{array}{cccc}f& 0& 0& 0\\ 0& f& 0& 0\\ 0& 0& 1& 0\end{array}\right] \left[\begin{array}{c}x\\ y\\ z\\ 1\end{array}\right]$$
    (4)

    This conversion is from 3D to 2D and belongs to perspective projection. After this conversion, the coordinates of the projection point are not yet expressed in pixels, so a further conversion is carried out.

  3. c)

    The actual relationship between image coordinate system and pixel coordinate system is as follows:

    $$\left\{\begin{array}{c}u= \frac{X}{dx}+{u}_{0}\\ v= \frac{Y}{dy}+{v}_{0}\end{array}\right.$$
    (5)
    $$\left\{\begin{array}{c}u-{u}_{0}=\frac{X}{dx}={s}_{x}\cdot X\\ v-{v}_{0}=\frac{Y}{dy}={s}_{y} \cdot Y\end{array}\right.$$
    (6)

    Because both the image coordinate system and the pixel coordinate system lie in the image plane, they differ only in origin and scale (their respective units); otherwise they are the same.

  4. d)

    Transformation between camera coordinate system and pixel coordinate system.

    $$\left\{\begin{array}{c}u-{u}_{0}=\frac{f{s}_{x}x}{z}={f}_{x}x/z\\ v-{v}_{0}=\frac{f{s}_{y}y}{z}={f}_{y}y/z\end{array}\right.$$
    (7)

    \({f}_{x}\) is the focal length along the u-axis and \({f}_{y}\) is the focal length along the v-axis, both expressed in pixels. The four quantities \({f}_{x}\), \({f}_{y}\), \({u}_{0}\), \({v}_{0}\) are called the internal parameters of the camera, because they are determined by the structure of the camera itself.

  5. e)

    Transformation relationship between pixel coordinate system and world coordinate system:

$$\mathrm{z}\left[\begin{array}{c}u\\ v\\ 1\end{array}\right]=\left[\begin{array}{cccc}{f}_{x}& 0& {u}_{0}& 0\\ 0& {f}_{y}& {v}_{0}& 0\\ 0& 0& 1& 0\end{array}\right] \left[\begin{array}{cc}R& T\\ {0}^{T}& 1\end{array}\right] \left[\begin{array}{c}{X}_{w}\\ {Y}_{w}\\ {Z}_{w}\\ 1\end{array}\right]={M}_{1} \cdot {M}_{2} \cdot X=M \cdot X$$
(8)

Using the above expressions, the internal parameters of the camera can be uniquely determined: the collected corner coordinates are matched one by one with their image point coordinates, and the internal and external parameters of the camera are calculated to complete the camera calibration.
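
As a hedged illustration of Eq. (8), the sketch below composes the extrinsic transform (M2) and the intrinsic projection (M1); all numeric values for K, R and T are assumptions, not the calibrated parameters of this paper.

```python
import numpy as np

# Assumed extrinsics: a small rotation about the y-axis and a translation
# from the world frame to the camera frame (placeholder values).
theta = np.deg2rad(5.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([0.1, -0.05, 0.3])

# Assumed intrinsics (f_x, f_y, u_0, v_0)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def world_to_pixel(X_w: np.ndarray) -> np.ndarray:
    """Eq. (8): world point -> camera frame (M2) -> pixel coordinates (M1)."""
    X_cam = R @ X_w + T    # extrinsic transform, Eq. (2)
    p = K @ X_cam          # intrinsic projection, Eq. (1)/(7)
    return p[:2] / p[2]    # divide by depth to obtain (u, v)

print(world_to_pixel(np.array([0.5, 0.2, 5.0])))
```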

Specific implementation steps (a minimal OpenCV sketch follows the list):

  1. 1.

    Preprocessing the image

  2. 2.

    Edge detection

  3. 3.

    Extracting the contour of the calibration plate

  4. 4.

    Corner detection

  5. 5.

    Calibration
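
A hedged sketch of these steps with OpenCV is given below; the board size, square size and image folder are assumptions, and OpenCV's built-in chessboard detector stands in for the edge, contour and corner steps listed above.

```python
import glob
import cv2
import numpy as np

# Inner-corner grid of the chessboard (assumed 11 x 8, as used in Sect. 5)
pattern = (11, 8)
square = 0.03  # 30 mm squares (assumption)

# 3D corner coordinates of the board in its own plane (Z = 0)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_pts, img_pts = [], []
for path in glob.glob("calib_images/*.png"):   # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)  # corner detection
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_pts.append(objp)
        img_pts.append(corners)

# Calibration: returns the intrinsic matrix, distortion, and per-view extrinsics
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)
print("Intrinsic matrix K:\n", K)
```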

The detected corner points and the internal and external parameter matrices of the camera are shown in Figs. 4 and 5 below:

Fig. 4. Camera calibration corner diagram

Fig. 5. Camera calibration parameters

3 Lidar Calibration

A line-scan lidar with 16 laser lines is selected for this scheme. The lidar operates as follows: the target distance is measured by transmitting and receiving laser signals. The lidar controls its scan by rotating an internal motor, sweeping the linear laser array across the external environment, and the distance from the lidar to the target object is calculated according to the time-of-flight (TOF) principle. Inside the lidar there are a laser transmitter and a laser receiver. During operation, the lidar emits a laser pulse and an internal timer starts timing; when the pulse hits the target it is reflected back to the receiver, and the timer records the arrival time. The actual travel time is obtained by subtracting the start time from the return time, and because the speed of light is constant, the actual distance can be calculated from this time of flight.
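
As a toy illustration of this TOF calculation (the timing values below are made up), the measured range is half the round-trip distance travelled at the speed of light:

```python
# Minimal sketch of the TOF range computation described above.
C = 299_792_458.0  # speed of light (m/s)

def tof_range(t_emit: float, t_return: float) -> float:
    """Range from a single laser return: half the round-trip distance."""
    return C * (t_return - t_emit) / 2.0

# e.g. a 33.3 ns round trip corresponds to roughly 5 m
print(tof_range(0.0, 33.3e-9))
```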

The lidar coordinate system describes the position of an object relative to the lidar, as shown in Fig. 6:

Fig. 6. Schematic diagram of lidar coordinate system

When collecting data, the laser line ID can be used to consult Table 1: because each laser point carries a specific ID, the corresponding laser line inclination (vertical angle) is uniquely determined. The lookup table is shown in Table 1. From the distance value r actually measured by the lidar, the coordinates X0 of the laser point in the scanning plane coordinate system are obtained through Eq. (9) [5].

$${X}_{0}=\left[\begin{array}{c}{x}_{0}\\ {y}_{0}\\ 0\end{array}\right]=\left[\begin{array}{c}r\;\mathit{sin}\;\omega \\ r\;\mathit{cos}\;\omega \\ 0\end{array}\right]$$
(9)
Table 1. Vertical angles (ω) by laser ID and model

While the lidar is scanning, a scanning angle α is obtained; this is the angle between the scanning plane and the lidar coordinate plane. The scanning plane coordinates are transformed into lidar coordinates using the rotation matrix

$${R}_{x}=\left[\begin{array}{ccc}1& 0& 0\\ 0& \mathit{sin}\;\alpha & -\mathit{cos}\;\alpha \\ 0& \mathit{cos}\;\alpha & \mathit{sin}\;\alpha \end{array}\right]$$
(10)

Obtain the coordinates of the target corner in the lidar coordinate system:

$${X}_{C}=\left[\begin{array}{c}x\\ y\\ z\end{array}\right]={R}_{x}\times {X}_{0}$$
(11)
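
The following sketch applies Eqs. (9)–(11) to convert a single lidar return into lidar-frame coordinates; the range and angle values are placeholders, and in practice ω would be looked up from Table 1 by laser ID. It assumes the reconstructed form of the rotation matrix in Eq. (10).

```python
import numpy as np

def lidar_point(r: float, omega_deg: float, alpha_deg: float) -> np.ndarray:
    """Range r plus vertical angle omega and scanning angle alpha -> lidar frame."""
    w, a = np.deg2rad(omega_deg), np.deg2rad(alpha_deg)

    # Eq. (9): point in the scanning-plane coordinate system
    X0 = np.array([r * np.sin(w), r * np.cos(w), 0.0])

    # Eq. (10): rotation of the scanning plane into the lidar frame
    Rx = np.array([[1.0,       0.0,        0.0],
                   [0.0, np.sin(a), -np.cos(a)],
                   [0.0, np.cos(a),  np.sin(a)]])

    # Eq. (11): X_C = R_x * X_0
    return Rx @ X0

# e.g. a 5 m return on the +1 deg laser at a 30 deg scanning angle (placeholders)
print(lidar_point(5.0, 1.0, 30.0))
```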

4 Joint Calibration of Lidar and Camera

The camera coordinate system and the lidar coordinate system are established to obtain the target corner coordinates in their respective fields of view. In the lidar coordinate system the corners are three-dimensional coordinates, while in the camera image they are two-dimensional corners.

The transformation from the lidar coordinate system to the pixel coordinate system of the camera is:

$$z\left[\begin{array}{c}u\\ v\\ 1\end{array}\right]=\left[\begin{array}{cccc}{f}_{x}& 0& {u}_{0}& 0\\ 0& {f}_{y}& {v}_{0}& 0\\ 0& 0& 1& 0\end{array}\right] \cdot \left[\begin{array}{cc}R& T\\ {0}^{T}& 1\end{array}\right] \cdot \left[\begin{array}{c}X\\ Y\\ Z\\ 1\end{array}\right]$$
(12)

Joint calibration can be realized by the following methods:

  1. 1.

    Correspondence between 3D points and 2D planes [6]

  2. 2.

    Calibration based on multi-sensor motion estimation [7]

  3. 3.

    Calibration is completed by maximizing mutual information between lidar and camera [8]

  4. 4.

    Volume and intensity data registration based on geometry and image [9]

To complete the transformation from 3D points to 2D points, we choose the PnP algorithm [10] (which matches 3D points to 2D points) to calculate the rotation and translation between the two coordinate systems. The final conversion relationship is as follows:

$${\mathrm{X}}_{\mathrm{c}}=\mathrm{MX}+\mathrm{H}$$
(13)

In the above formula, M is the rotation matrix, which records the transformation between the lidar coordinate system and the camera coordinate system, and H is the translation vector, which records the offset between the origins of the two coordinate systems. Finally, the joint calibration between lidar and camera is completed by unifying the obtained 3D and 2D points.
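
A minimal sketch of this step using OpenCV's solvePnP is given below; the corner coordinates and the intrinsic matrix are placeholders standing in for the measured lidar corners, the detected image corners, and the intrinsics obtained in Sect. 2, and this is one possible implementation rather than the exact code used in the paper.

```python
import cv2
import numpy as np

# Four corners of the rectangular board in the lidar frame (metres, placeholders)
pts_3d = np.array([[4.2, -0.5,  0.6],
                   [4.2,  0.5,  0.6],
                   [4.2,  0.5, -0.2],
                   [4.2, -0.5, -0.2]], dtype=np.float64)

# The same corners detected in the camera image (pixels, placeholders)
pts_2d = np.array([[250.0, 180.0],
                   [420.0, 182.0],
                   [418.0, 330.0],
                   [252.0, 328.0]], dtype=np.float64)

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])
dist = np.zeros(5)  # assume distortion already corrected

# PnP: rotation (as a rotation vector) and translation from lidar to camera frame
ok, rvec, tvec = cv2.solvePnP(pts_3d, pts_2d, K, dist)
M, _ = cv2.Rodrigues(rvec)   # rotation matrix M of Eq. (13)
H = tvec                     # translation vector H of Eq. (13)
print(M, H)
```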

PnP algorithm: taking the lidar coordinate system as the world coordinate system, three-dimensional feature points in the lidar coordinate system and the corresponding points obtained by perspective projection into the image coordinate system are selected, so as to obtain the pose relationship between the camera coordinate system and the lidar coordinate system, namely the rotation matrix R and the translation vector t, and complete the matching of 3D points to 2D points.

Requirements for the feature points: their coordinates must be known both in the three-dimensional scene and in the two-dimensional image, so that the perspective projection yields a definite solution. We select the four corners of the rectangular board as feature points: the 3D points are A, B, C, D and the 2D points are a, b, c, d. The triangles satisfy the following similarity relationships (Fig. 7):

Fig. 7. Pose diagram of camera coordinate system relative to lidar coordinate system

where: △Oab ∽ △OAB, △Oac ∽ △OAC, △Obc ∽ △OBC.

  1. (1)

    According to the cosine theorem:

    $$OA^{2} + OB^{2} - 2 \cdot OA \cdot OB \cdot cos \,{<}a,b{>} = AB^{2}$$
    (14)
    $$OA^{2} + OC^{2} - 2 \cdot OA \cdot OC \cdot cos \,{<}a,c{>} = AC^{2}$$
    (15)
    $$OB^{2} + OC^{2} - 2 \cdot OB \cdot OC \cdot cos \,{<}b,c{>} = BC^{2}$$
    (16)
  2. (2)

    Normalize the above equations by dividing each of them by OC², and let \(x=\frac{OA}{OC}\), \(y=\frac{OB}{OC}\). This gives:

    $$x^{2} + y^{2} - 2 \cdot x \cdot y \cdot cos\, {<}a,b{>} = AB^{2} /OC^{2}$$
    (17)
    $$x^{2} + 1 - 2 \cdot x \cdot cos\, {<}a,c{>} = AC^{2} /OC^{2}$$
    (18)
    $$y^{2} + 1 - 2 \cdot y \cdot cos \,{<}b,c{>} = BC^{2} /OC^{2}$$
    (19)
  3. (3)

    Let \(u = AB^2/OC^2\), \(v = BC^2/AB^2\), \(w = AC^2/AB^2\); then:

    $$x^{2} + y^{2} - 2 \cdot x \cdot y \cdot cos\, {<}a,b{>} = u$$
    (20)
    $$x^{2} + 1 - 2 \cdot x \cdot cos \,{<}a,c{>} = {\text{wu}}$$
    (21)
    $$y^{2} + 1 - 2 \cdot y \cdot cos \,{<}b,c{>} = vu$$
    (22)
  4. (4)

    Simplified:

    $$\left( {1 - w} \right)x^{2} - w \cdot y^{2} - 2 \cdot x \cdot cos \,{<}a,c{>} + 2 \cdot w \cdot x \cdot y \cdot cos\,{<}a,b{>} + 1 = 0$$
    (23)
    $$\left( {1 - v} \right)y^{2} - v \cdot x^{2} - 2 \cdot y \cdot cos \,{<}b,c{>} + 2 \cdot v \cdot x \cdot y \cdot cos \,{<}a,b{>} + 1 = 0$$
    (24)

What we need to do is solve for the coordinates of A, B and C in the camera coordinate system through the above equations. The image positions of the 2D points, and hence \(\mathrm{cos}\,{<}a,b{>}\), \(\mathrm{cos}\,{<}a,c{>}\), \(\mathrm{cos}\,{<}b,c{>}\), are known, and w and v can be computed from the known 3D distances. The problem is therefore reduced to solving the above system of two quadratic equations in two unknowns.

The specific solution process of the above binary quadratic equations is as follows:

  1. 1.

    The two quadratic equations in two unknowns are transformed into an equivalent characteristic set; the equivalent equations are as follows:

    $$a_{4} x^{4} + a_{3} x^{3} + a_{2} x^{2} + a_{1} x^{1} + a_{0} = 0$$
    (25)
    $$b_{1} y - b_{0} = 0$$
    (26)
  2. 2.

    According to Wu's elimination method, the coefficients \(a_{0}\)–\(a_{4}\) and \(b_{0}\), \(b_{1}\) are all known, so the values of x and y can be obtained.

  3. 3.

    Calculate the values of OA, OB and OC

    $$x^{2} + y^{2} - 2 \cdot x \cdot y \cdot cos\,{<} a,b{>} = AB^{2} /OC^{2}$$
    (27)

    where: x = OA/OC, y = OB/OC.

  4. 4.

    Obtain the coordinates of A, B and C in the camera coordinate system:

    $$\mathrm{A}=\overrightarrow{\mathrm{a}}\cdot \Vert \mathrm{OA}\Vert$$
    (28)

When using the PnP algorithm, three groups of corresponding points are used and up to four groups of solutions can be obtained, so point d is used to verify the results and determine which group of solutions is correct.
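
One way to realise this check is sketched below with OpenCV's solveP3P, which returns up to four candidate poses from three correspondences; the data are placeholders, and the held-out corner d selects the pose with the smallest reprojection error.

```python
import cv2
import numpy as np

# Placeholder data: corners A, B, C in the lidar frame with image points a, b, c;
# the fourth corner D/d is held out to disambiguate the P3P solutions.
abc_3d = np.array([[[4.2, -0.5,  0.6]],
                   [[4.2,  0.5,  0.6]],
                   [[4.2,  0.5, -0.2]]])
abc_2d = np.array([[[250.0, 180.0]],
                   [[420.0, 182.0]],
                   [[418.0, 330.0]]])
d_3d = np.array([[[4.2, -0.5, -0.2]]])
d_2d = np.array([252.0, 328.0])

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
dist = np.zeros(5)

# Up to four candidate poses from the three correspondences
n, rvecs, tvecs = cv2.solveP3P(abc_3d, abc_2d, K, dist, cv2.SOLVEPNP_P3P)

# Keep the pose that reprojects D closest to its measured image point d
best = min(range(n), key=lambda i: np.linalg.norm(
    cv2.projectPoints(d_3d, rvecs[i], tvecs[i], K, dist)[0].ravel() - d_2d))
print("selected pose:", rvecs[best].ravel(), tvecs[best].ravel())
```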

The joint calibration results are shown as follows (Fig. 8):

Fig. 8. Joint calibration parameters

5 Experiments

The algorithm is verified through the following experiments.

5.1 Experimental Equipment

This experiment uses a Velodyne 16-line lidar and a narrow_stereo monocular camera. In the experiment, the relative position of the lidar and the camera is fixed. The placement of the calibration plate is shown in the figure; the calibration plate is located in front of the lidar (Fig. 9).

Fig. 9. Schematic diagram of placing lidar, camera and calibration plate

The selected experimental equipment is shown in Table 2:

Table 2. Experimental equipment

5.2 Experimental Results

According to the algorithms in the previous sections, we completed the following experiments:

  1. (1)

    The lidar and the camera are fixed at their respective positions. The height of the camera is 1.3 m and the height of the lidar is 1.2 m.

  2. (2)

    A fixed 12 × 9 chessboard calibration board with a grid spacing of 30 mm is used, placed about 4–5 m in front of the lidar. The lidar and the camera collect images at the same time. In addition, a rectangular wooden board is used to complete the image acquisition.

  3. (3)

    The positions of the calibration plate and the board are moved, and the images are collected again.

  4. (4)

    The two-dimensional coordinates of the four corners of the board are obtained from the camera image, and their three-dimensional coordinates are obtained from the lidar data.

  5. (5)

    The 11 × 8 corners of the chessboard calibration board are used to complete the separate calibration of the camera, and then the coordinates of the four corresponding corners of the rectangular board are used to complete the joint calibration of the two sensors.

The individual calibration results of the camera are shown in Table 3 below. Because there are many chessboard corners (11 × 8 in total), 10 of them are selected:

The results obtained after joint calibration of lidar and camera are shown in Table 4 below:

Table 3. Camera calibration results

Table 3 shows the results of camera calibration alone. After the measured image corner coordinates are obtained, the calculated corner positions are obtained by reprojection, using the three-dimensional corner coordinates in the camera frame and the internal and external parameter matrices of the camera. Compared with the measured values, the average error of camera calibration is 0.0146333 pixels.

Table 4. Joint calibration results

Table 4 shows the conversion results after joint calibration. Using the R and T matrices between lidar and camera together with the internal parameter matrix of the camera, the conversion from the lidar coordinate system to the camera's pixel coordinate system can be completed quickly. Compared with the measured camera values, the average error of joint calibration is 1.81792 pixels.

It is clear from the two tables above that the camera calibration itself is quite accurate, with an average error of 0.0146333 pixels, which meets the required accuracy. However, because the lidar itself is less precise, with a quantization accuracy at the decimeter level, the accuracy obtained after joint calibration is slightly worse than the calibration accuracy of the camera alone.
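
The average pixel errors reported above can be computed in the following hedged form; the arrays are placeholders standing in for the measured and reprojected corner coordinates of Tables 3 and 4.

```python
import numpy as np

# Measured pixel corners vs. corners reprojected through the calibrated model (placeholders)
measured    = np.array([[250.1, 180.0], [420.2, 182.1], [418.0, 330.3]])
reprojected = np.array([[250.0, 180.1], [420.0, 182.0], [418.2, 330.1]])

# Mean Euclidean reprojection error in pixels, as reported in Tables 3 and 4
err = np.linalg.norm(measured - reprojected, axis=1).mean()
print(f"average reprojection error: {err:.4f} px")
```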

6 Conclusion

In order to realize multi-sensor fusion of lidar and camera, a joint calibration method between the lidar and camera sensors based on a rectangular board is proposed in this paper. The experimental results show that this method has practical significance.