1 Introduction

The increasing capability and ongoing miniaturization of hardware and sensors nowadays make it possible to construct various kinds of multi-sensor systems (MSS). MSS are frequently used in engineering geodesy to capture the environment. For further processing of the captured data, it is crucial to know the position and orientation of the MSS with respect to a superordinate coordinate systemFootnote 1. The task of determining position and orientation in a superordinate coordinate system is called georeferencing. Georeferencing usually comprises the determination of three translations and three rotations, also known as six Degrees of Freedom (DoF) or pose.

Unmanned aerial vehicles (UAV) have become a widespread, useful, and affordable platform for MSS in the past decade and are optimally suited to capture common scenarios in engineering geodesy. An MSS mounted on a UAV is subsequently called an unmanned aerial system (UAS).

Precise georeferencing of a UAS is a challenging task, especially in urban areas. Global Navigation Satellite System (GNSS) data are affected by shadowing and multipath effects and are, therefore, inaccurate or even unavailable. Basically, low-cost GNSS receivers achieve positional accuracies of a few meters. These accuracies may be improved to decimeter or even centimeter level under good GNSS conditions and using satellite-based augmentation systems, differential GNSS, or real-time kinematics. Lightweight and low-cost inertial measurement units (IMU)Footnote 2 achieve orientation accuracies of \(0.1^{\circ}\) for roll and pitch and a slightly worse accuracy of \(0.8^{\circ}\) for heading. Combined GNSS-IMU systems improve the heading accuracy to \(0.2^{\circ}\)–\(0.5^{\circ}\). However, a general problem with IMU data is that they are often seriously affected by drift.

A further possibility for georeferencing an MSS is tracking by an external sensor. In the case of a UAS, this is a cumbersome and inefficient task due to limited measuring ranges and occlusions.

In this paper, we present the georeferencing of a UAS which is equipped with a three-dimensional (3D) laser scanner, a low-cost GNSS receiver, and an IMU, among other sensors. We set ourselves the requirement to determine the pose of the UAS with a precision (as a measure of the accuracy) better than 10 cm in position and \(0.1^{\circ}\) in orientation. The accuracy of the pose is decisive for derived results such as digital terrain models or detailed 3D city models. Therefore, we fused the UAS-based laser scanner measurements towards building facades, which are modeled as planes in a 3D city model, with the other sensor data within an iterated extended Kalman filter (IEKF). To evaluate the functionality and performance of the implemented IEKF, we developed a software simulation that tests it under different scenarios. All results presented are part of a research project, subsequently called the UAS project.

1.1 Georeferencing of MSS/UAS

The strategies to accomplish the georeferencing vary depending on the measurement configuration, the sensors and hardware available, the environment, and the accuracies to be fulfilled. The georeferencing strategies can be generally classified into direct, indirect, and data-driven georeferencing (see, e.g., Schuhmacher and Böhm 2005; Paffenholz 2012; Holst et al. 2015). An alternative classification regarding indoor applications is given in Vogel et al. (2016).

For directFootnote 3 georeferencing, sensors which measure the pose of an MSS directly are integrated into the MSS. These sensors could be, for example, a GNSS receiver (Paffenholz 2012; Talaya et al. 2004), an IMU (Talaya et al. 2004), or an external sensor, such as a total station or a laser tracker (Dennig et al. 2017; Hartmann et al. 2018), which determine the pose by angle and distance measurements to a reflector on the MSS.

IndirectFootnote 4 georeferencing comprises methods where the pose of the MSS is determined by measurements towards known targets. These targets may be flat markers with a specific pattern (Abmayr et al. 2008) or simple 3D geometries, such as cylinders or spheres (Elkhrachy and Niemeier 2006). The position of the targets within the superordinate coordinate system is determined using an external sensor, for example, a total station. Indirect georeferencing is commonly used in bundle adjustment or for the georeferencing of static terrestrial laser scanners (TLS).

Data-driven georeferencing basically conforms to indirect georeferencing. Instead of known targets, the data sets are matched to reference data sets. These reference data sets may be point clouds that are already georeferenced (Soloviev et al. 2007; Glira et al. 2015), digital surface models, or 3D city models (Hebel et al. 2009; Li-Chee-Ming and Armenakis 2013; Unger et al. 2016, 2017). The matching is accomplished, for instance, via point-to-point assignment, for example, an iterative closest point algorithm (Besl and McKay 1992), or via point-to-object assignment (see, e.g., Schuhmacher and Böhm 2005). Please note that data-driven georeferencing approaches vary widely; a large share can be found among the commonly used Simultaneous Localization and Mapping (SLAM) approaches (see, e.g., Nguyen et al. 2006; Lee et al. 2007; Jutzi et al. 2013; Kaul et al. 2015; Nüchter et al. 2015).

In principle, all the georeferencing strategies mentioned are suitable for both static and kinematic MSS. For a static MSS, the six DoF have to be determined only once, whereas for a kinematic MSS, the six DoF have to be determined for each measuring epoch. Depending on the measuring frequency of the sensors used, it may even be indispensable to model the pose almost continuously.

1.2 Filtering Techniques for Georeferencing

Georeferencing of a kinematic MSS is commonly handled as part of the state parameter vector within a filtering approach. Filtering is a two-step procedure in which the current state parameter vector is estimated based on the previous state parameter vector and the current observations. In the first step, called prediction, the current state parameter vector is predicted using the previous state parameter vector and a specific system model. The system model takes into account the control variables and the process noise that affect the current state. In the second step, called the measurement update step, the predictions are corrected based on the observation equations, which compare the predicted observations with the current original ones.
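To make the two-step procedure concrete, the following minimal sketch implements one prediction and one measurement update for a linear 1D constant-velocity toy model. All matrices, values, and function names are illustrative only and are not the system model of the UAS project; in the extended variants discussed below, the linear models are replaced by Jacobians of nonlinear functions.

```python
import numpy as np

def predict(x, P, F, Q):
    # Prediction step: propagate state and covariance with system model F
    return F @ x, F @ P @ F.T + Q

def update(x_pred, P_pred, z, H, R):
    # Measurement update step: correct the prediction with observation z
    y = z - H @ x_pred                     # innovation
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_upd = x_pred + K @ y
    P_upd = (np.eye(len(x_upd)) - K @ H) @ P_pred
    return x_upd, P_upd

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])      # position integrates velocity
Q = 0.01 * np.eye(2)                       # system (process) noise
H = np.array([[1.0, 0.0]])                 # only the position is observed
R = np.array([[0.25]])                     # observation noise

x, P = np.array([0.0, 1.0]), np.eye(2)     # initial state and covariance
x, P = predict(x, P, F, Q)
x, P = update(x, P, np.array([1.1]), H, R)
```

The update pulls the predicted position towards the observation, weighted by the ratio of prediction and observation uncertainty encoded in the Kalman gain.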

In recent years, many pose estimation algorithms based on the extended Kalman filter (EKF) have been successfully applied to solve the pose estimation problem of a UAS. Tailanian et al. (2014) focus on the fusion of the local sensors of a UAS, in which GNSS and IMU data are combined by means of an EKF. Hol et al. (2007) show pose estimation for a six-DoF robot using an EKF for sensor fusion. Forster et al. (2013) use the collaboration of multiple UAS for pose estimation, combining multiple SLAM algorithms to create an accurate pose estimate. In their approach, real-time camera pose estimation is accomplished by combining inertial and vision measurements using nonlinear state estimation approaches.

In the case of nonlinear observation equations, a linearization (realized by means of a Taylor series expansion) has to be applied to overcome the nonlinearity issue (see Denham and Pines 1966). If the linearization errors degrade the efficiency of the filter or its convergence, re-linearizing the measurement equation around the updated state may reduce these difficulties. A filter with such a procedure is called an IEKF. Researchers commonly use explicit observation equations in the IEKF, meaning that the observations are considered a function of the state parameters. Such an observation model is generally called a Gauss Markov model (GMM).

If the equations relating the observations to the state parameters are condition equations (for example, 3D points that have to fulfill plane equations), the typical formulation of the GMM does not apply. In other words, the observations and the state parameters are not separable (implicit measurement equations, see Dang 2007). In such a case, we are dealing with a Gauss Helmert model (GHM). In Dang (2007, 2008), Steffen and Beder (2007), Steffen (2013) and Ettlinger et al. (2018), an IEKF is used which deals with implicit measurement equations. In Vogel et al. (2018), an IEKF is used for georeferencing and extended with additional nonlinear equality constraints. In Vogel et al. (2019), the approach is further extended by integrating nonlinear inequality constraints. With the ability to handle implicit measurement equations and nonlinear inequality constraints, it is possible to depict almost any mathematical relationship within an IEKF.

1.3 Contribution

In this paper, we present an adaption of the IEKF in Vogel et al. (2018) for data-driven georeferencing of a UAS with an accuracy better than 10 cm in position and \(0.1^{\circ}\) in orientation. The highlight of the IEKF presented is the fusion of laser scanner measurements towards building facades, modeled as planes in a 3D city model, with the corresponding plane parameters of the 3D city model. This fusion leads to nonlinear implicit measurement equations and additional nonlinear equality constraints, which are covered within the IEKF. To evaluate the functionality and performance of the IEKF, we developed a simulation tool with a simple dynamic system model defining the state of the UAS and other states. On the basis of the states defined, we simulated sensor measurements regarding the sensor specifications and further characteristics. We tested the IEKF for different scenarios and analyzed its sensitivity towards different data characteristics.

1.4 Outline

The dedicated sections are as follows:

In Sect. 2 we give a detailed description of the UAS, the implemented IEKF, and the entire workflow for georeferencing. Section 3 presents the simulation tool developed and gives an overview of the specifically chosen values for the IEKF. We present and discuss the results of the IEKF applied to the simulation in Sect. 4. Finally, conclusions and an outlook are drawn in Sect. 5.

2 IEKF for Georeferencing of a UAS

The Geodetic Institute (GIH) and the Institute of Photogrammetry and GeoInformation (IPI) of Leibniz University Hannover (LUH) are currently working on the UAS project. The UAS project deals with the precise determination of the trajectory of a UAS by integrating camera and laser scanner data in combination with generalized object information. Within this paper, we will focus on the usage of laser scanner data in combination with a non-generalized 3D city model. We will exclusively use simulated data to evaluate the functionality and performance of the algorithms developed. The simulation scenarios were chosen in accordance with common measurement scenarios and data characteristics.

Fig. 1
figure 1

Schematic depiction of the UAS capturing the environment. The points measured (red dots) may represent the ground, vegetation, building facades, or other objects

Fig. 2
figure 2

Schematic depiction of the assignment of the transformed measured points. Some building facades are represented as planes in a generalized 3D city model (solid blue lines). The laser scanner measurements are transformed into the superordinate coordinate system according to the roughly known translation \({\varvec{t}}=\left[ t_x,t_y,t_z\right]\) and orientation \({\varvec{o}}=\left[ \omega ,\phi ,\kappa \right]\). The origin of the laser scanner is depicted in the upper left corner. Note the slight difference from the original measurement configuration due to the roughly known pose. The transformed points which are close to one of the planes are assigned to that plane (green dots); the closeness is determined by a distance threshold (dashed blue lines). In this example, 5 points are assigned to plane 1 and 4 points are assigned to plane 2, whereas the other points (red dots) are not assigned. Subsequently, only the green dots are used as observations in the GHM, i.e., the filtering approach

2.1 General Idea

The UAS moves through an urban area where the building facades are modeled as planes within a 3D city model. Three-dimensional city models with a level of detail 1 (block model) and a level of detail 2 (model with differentiated roof structures and boundary surfaces) are freely available for many cities. A detailed classification of the different levels of detail is given in Gröger et al. (2012). The pose of the UAS is roughly known, for example, from measurements of a GNSS receiver and an IMU. A 3D laser scanner continuously captures the environment (red dots in Fig. 1). These captured 3D points may represent the ground, vegetation, building facades, or other objects. The laser scanner measurements are given in the local sensor coordinate system and are transformed into a superordinate coordinate system according to the roughly known translation \({\varvec{t}}=\left[ t_x,t_y,t_z\right]\) and orientation \({\varvec{o}}=\left[ \omega ,\phi ,\kappa \right]\) (see Fig. 2). The transformed laser scanner measurements are assigned to planes of the 3D city model based on the distance between the scanner points and the planes of the building model (see Sect. 2.3). Only the points that are close enough to a plane of the 3D city model (green points in Fig. 2) are used as observations afterwards. The final pose parameters are estimated within a GHM, i.e., the filtering approach, by minimizing the distances between the assigned laser scanner measurements and the planes of the 3D city model to which they are assigned.

Fig. 3
figure 3

Sketch of the platform setup. The x-axis of the local laser scanner coordinate system (red) points in the reader's line of sight

2.2 UAS

In the UAS project, a UAV is equipped with a 3D laser scanner, cameras, a GNSS receiver, and an IMU. Figure 3 depicts a simplified sketch of the platform setup (without cameras). The laser scannerFootnote 5 scans 16 scan lines which are nearly perpendicular to the sensor's vertical axis (corresponding to the z-axis in Fig. 3). The divergence between adjacent scan lines is \(2^{\circ}\). Thus, the laser scanner has a field of view of \(30^{\circ}\times {360}^{\circ}\). The resolution of the points within a scan line can be set between \({0.1}^{\circ}\) and \({0.4}^{\circ}\). The rotation rate depends on this setting and ranges from 5 to 20 Hz. We set the resolution to \({0.4}^{\circ}\) to obtain the higher rotation rate of 20 Hz. Furthermore, it is possible to exclude certain angle areas from the measurement, because they cannot provide any data or because the data generated are not needed for further processing. In our case, it makes sense to exclude the angle areas where the laser scanner measures towards the UAV.
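As a back-of-the-envelope check of the chosen setting, the resulting point rate follows directly from the specifications above; this is plain arithmetic on the stated values, not a vendor interface:

```python
# Point rate for the stated scanner settings: 16 scan lines,
# 0.4 deg horizontal resolution, 20 Hz rotation rate.
scan_lines = 16
resolution_deg = 0.4
rotation_rate_hz = 20

points_per_line = 360.0 / resolution_deg             # 900 points per line
points_per_rotation = scan_lines * points_per_line   # 14,400 points
points_per_second = points_per_rotation * rotation_rate_hz
print(points_per_second)  # 288000.0
```

Excluding angle areas, as described above, reduces this rate proportionally to the excluded fraction of the \({360}^{\circ}\) rotation.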

2.3 Assignment Algorithm

A crucial task in our georeferencing process is the assignment of the measured 3D points \({\varvec{C}}^{\text {loc}}\), given in a local coordinate system of the laser scanner, to the planes of the city model, given in the global coordinate system. To this end, we first have to transform \({\varvec{C}}^{\text {loc}}\) to \({\varvec{C}}^{\text {glo}}\):

$$\begin{aligned} {\varvec{C}}^{\text {glo}}&={\varvec{t}}+ {\varvec{R}}\cdot {\varvec{C}}^{\text {loc}}. \end{aligned}$$
(1)

The parameters of the translation \({\varvec{t}}\) describe the position of the origin of the local coordinate system. The rotation matrix \({\varvec{R}}\) is obtained based on the orientation \({\varvec{o}}=\left[ \omega ,\phi ,\kappa \right]\) of the local coordinate system according to  Luhmann (2013):

$${\varvec{R}}_{\omega } = \left[ {\begin{array}{*{20}l} 1 \hfill &\quad 0 \hfill &\quad 0 \hfill \\ 0 \hfill &\quad {\cos \omega } \hfill &\quad { - \sin \omega } \hfill \\ 0 \hfill &\quad {\sin \omega } \hfill &\quad {\cos \omega } \hfill \\ \end{array} } \right]$$
(2)
$${\varvec{R}}_{\phi } = \left[ {\begin{array}{*{20}l} {\cos \phi } \hfill &\quad 0 \hfill &\quad {\sin \phi } \hfill \\ 0 \hfill &\quad 1 \hfill &\quad 0 \hfill \\ { - \sin \phi } \hfill &\quad 0 \hfill &\quad {\cos \phi } \hfill \\ \end{array} } \right]$$
(3)
$${\varvec{R}}_{\kappa } = \left[ {\begin{array}{*{20}l} {\cos \kappa } \hfill &\quad { - \sin \kappa } \hfill &\quad 0 \hfill \\ {\sin \kappa } \hfill &\quad {\cos \kappa } \hfill &\quad 0 \hfill \\ 0 \hfill &\quad 0 \hfill &\quad 1 \hfill \\ \end{array} } \right]$$
(4)
$$\begin{aligned} {\varvec{R}}&={\varvec{R}}_{\omega }\cdot {\varvec{R}}_{\phi }\cdot {\varvec{R}}_{\kappa }. \end{aligned}$$
(5)

The assignment is realized by a simple distance criterion, as described in Unger et al. (2016, 2017). For each 3D point in \({\varvec{C}}^{\text {glo}}\), the Euclidean distance to each plane of the city model is calculated. For points whose distance to the nearest plane is less than a threshold value \(d_{\text {assign}}\), it is checked whether the projection of the point onto the plane lies within the bounding polygon of that plane. If it lies outside, the distance from the point to the bounding polygon of the plane is calculated and replaces the orthogonal distance. Points that project inside the bounding polygon or are closer than \(d_{\text {assign}}\) are assigned to their closest plane. The threshold \(d_{\text {assign}}\) has to be selected with regard to the accuracy of the points as well as the accuracy of the city model. The accuracy of the points depends mainly on the accuracy of the translation and orientation parameters \({\varvec{t}}\) and \({\varvec{o}}\) and on the accuracy of the measured points \({\varvec{C}}^{\text {loc}}\). The accuracy of the city model depends on the geometrical accuracy of its vertices and on the extent of the generalization effects by which the model deviates from the captured reality. The larger the threshold is chosen, the more points are assigned to planes, but the higher the probability of incorrect assignments that lead to incorrect results.
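A minimal sketch of this distance criterion, assuming planes in Hesse normal form with unit normals. The bounding-polygon check described above is omitted for brevity, and all names are our own:

```python
import numpy as np

def assign_points(points_glo, normals, dists, d_assign):
    """Assign each global point to its nearest plane (Hesse normal form)
    if the orthogonal distance is below d_assign; -1 means unassigned.
    Simplification: the bounding-polygon check of Sect. 2.3 is omitted."""
    D = np.abs(points_glo @ normals.T - dists)          # (N, E) distances
    nearest = np.argmin(D, axis=1)                      # closest plane index
    close = D[np.arange(len(points_glo)), nearest] < d_assign
    return np.where(close, nearest, -1)

normals = np.array([[1.0, 0.0, 0.0],    # plane 1: x = 0
                    [0.0, 1.0, 0.0]])   # plane 2: y = 5
dists = np.array([0.0, 5.0])
points = np.array([[0.02, 1.0, 1.0],    # close to plane 1
                   [3.0, 3.0, 3.0],     # close to no plane
                   [2.0, 4.95, 0.0]])   # close to plane 2
print(assign_points(points, normals, dists, d_assign=0.1))  # [ 0 -1  1]
```

Only the assigned points (indices \(\ge 0\)) would subsequently enter the observation vector of the filter.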

2.4 IEKF (Iterated Extended Kalman Filter)

The IEKF which we adapted for georeferencing of the UAS is given in Vogel et al. (2018). This approach can be applied to many different use cases. All available observations, types of prior information, and requested states have to be linked with their respective uncertainty information; on this basis, all information can be introduced into the IEKF.

Depending on the respective application, several relationships have to be established by means of linear or nonlinear functional models. For the requested states \({\varvec{x}}_\textit{k}\), a system model \({\varvec{f}}(\cdot )\) is needed to describe the physical behavior of the system from epoch \(\textit{k}-1\) to the current epoch \(\textit{k}\). This transfer is realized within the prediction step of the IEKF. Suitable functional relationships for each observation \({\varvec{l}}_\textit{k}\) have to be formulated within the measurement model \({{\varvec{h}}}(\cdot )\). Based on implicit (\({{\varvec{h}}}(E({\varvec{l}}),\,{\varvec{x}})=\mathbf 0\)) or explicit (\(E({\varvec{l}})-{{\varvec{h}}}({\varvec{x}})=\mathbf 0\)) formulations, relations between the available observations and the requested states are considered. \(E(\cdot )\) is the expectation operator of a random vector; here, the expected value of the observation vector \({\varvec{l}}\), which can be replaced by \(E({\varvec{l}})={\varvec{l}}+{\varvec{v}}\). The consideration of current observations is carried out within the measurement update step. Available prior information can also be integrated into the measurement model during the measurement update step. In addition, further suitable prior information can be applied in terms of state constraints by means of additional linear (\({\varvec{D}}{\varvec{x}}_{\textit{k}}={\varvec{d}}\)) or nonlinear (\(\textit{g}\left( {\varvec{x}}_{\textit{k}}\right) ={\varvec{b}}\)) functions. Here, \({\varvec{D}}\) is a known matrix, and \({\varvec{d}}\) and \({\varvec{b}}\) are known vectors with respect to the related state constraints. Such restrictions can be implemented by means of equality constraints within the constraint step.

The basic workflow of this IEKF is depicted in Fig. 4 in a simplified way. Detailed equations for initialization, prediction, measurement update, and constraints are given in the following Sects. 2.4.1–2.4.6. Algorithm 1, given in Sect. 2.5, summarizes the required input, the different computation steps, and the output.

Fig. 4
figure 4

Simplified basic workflow of the IEKF (grey) for georeferencing of a UAS by means of unknown states (yellow), available observations (green), and known prior information (blue)

2.4.1 State Parameter Vector

The state parameter vector \({\varvec{x}}_{k}\) consists of two parts, as already depicted in Fig. 4. The first part \({\varvec{x}}_{{\text{State}},k}\in {\mathbb {R}}^{9}\) consists of the current position \({\varvec{t}}_{k}\), orientation \({\varvec{o}}_{k}\) and velocity \({\varvec{v}}_{k}\) of the UAS (see Eq. 6). These state parameters describe the state of the UAS.

The second part consists, on the one hand, of the vector \({\varvec{x}}_{{\text{Plane}},k}\in {\mathbb {R}}^{4\cdot E}\), which contains the parameters of all the E city model planes in Hesse normal form (see Eq. 7). The Hesse normal form is defined by the \(3\times 1\) normal vector \({\varvec{n}}_{e}=[n_{x,e},n_{y,e},n_{z,e}]^{T}\) and the distance to the origin \(d_{e}\), with \(e=1\ldots E\). On the other hand, it consists of the vector \({\varvec{x}}_{V,k}\in {\mathbb {R}}^{3\cdot M}\), which contains all the M vertices of the building model (see Eq. 8):

$${\varvec{x}}_{{\text{State}},k} = \left[ \underbrace{t_{x,k},t_{y,k},t_{z,k}}_{{\varvec{t}}_{k}},\underbrace{\omega _{k}, \phi _{k}, \kappa _{k}}_{{\varvec{o}}_{k}},\underbrace{v_{x,k},v_{y,k},v_{z,k}}_{{\varvec{v}}_{k}}\right] ^{T}$$
(6)
$${\varvec{x}}_{{\text{Plane}},k} = \left[ \underbrace{{\varvec{n}}_{1,k};d_{1,k}}_{\text{ plane } \text{1 }};\cdots ;\underbrace{{\varvec{n}}_{E,k};d_{E,k}}_{\text{ plane } {} \textit{E}}\right]$$
(7)
$${\varvec{x}}_{V,k} = \left[ {\varvec{V}}^{\text {glo}}_{1,k};\ldots ;{\varvec{V}}^{\text {glo}}_{m,k};\ldots ;{\varvec{V}}^{\text {glo}}_{M,k}\right]$$
(8)

with

$$\begin{aligned} {{\varvec{V}}}^{\text {glo}}_{m,k}=\left[ \textit{V}^{\text {glo}}_{\textit{x,m,k}},\textit{V}^{\text {glo}}_{\textit{y,m,k}},\textit{V}^{\text {glo}}_{\textit{z,m,k}}\right] ^{T}. \end{aligned}$$
(9)

The integration of the plane parameters and vertices into the state parameter vector serves to identify and correct planes that are not accurately represented in the 3D city model. Although this purpose is not addressed in this paper, we already introduce the mathematical relationship here.

The complete state parameter vector \({\varvec{x}}_{k}\) is arranged according to:

$${\varvec{x}}_{k} = \left[ {\varvec{x}}_{{\text{State}},k};\;{\varvec{x}}_{{\text{Plane}},k};\;{\varvec{x}}_{V,k}\right] ^{}.$$
(10)
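The arrangement of Eq. (10) can be illustrated as a simple concatenation of the three parts; E, M, and all values are placeholders:

```python
import numpy as np

# Illustrative layout of the state parameter vector (Eqs. 6-10) for
# E planes and M vertices; all numeric values are placeholders.
E, M = 2, 6
x_state  = np.zeros(9)        # t_k, o_k, v_k of the UAS (Eq. 6)
x_plane  = np.zeros(4 * E)    # [n_e; d_e] per plane (Eq. 7)
x_vertex = np.zeros(3 * M)    # vertex coordinates (Eqs. 8-9)
x_k = np.concatenate([x_state, x_plane, x_vertex])   # Eq. (10)
print(len(x_k))  # 35
```

The dimension of the state vector is thus \(9 + 4E + 3M\), which grows with the size of the city model rather than with the number of scanned points.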

2.4.2 Observation Vector

The MSS mentioned in Sect. 2.2 provides discrete 3D laser scanner point clouds (LSC) and 6D pose information by means of a GNSS receiver and an IMU. An LSC consists of a full scan rotation (\(30^{\circ}\times {360}^{\circ}\)). For the sake of simplicity, we will not consider the movement of the UAV during the period of a full scan rotation in the following. For the configuration chosen, this time period is at most 0.05 s. As already described in Sect. 2.2, it is possible to exclude certain angle areas of the laser scanner. The time period for the non-excluded areas is further reduced by excluding the angular areas in which the laser scanner measures in the direction of the UAV. This is acceptable due to the planned velocity of the UAV of about \(1\,\hbox {m}\,\hbox {s}^{-1}\) and an angular velocity of \(2^{\circ}\,\hbox {s}^{-1}\). In further development steps, the LSC should consist of fewer and temporally closer 3D points, for example, just a half or a quarter scan rotation. In addition, the initial vertices of the 3D city model are introduced as observations into our approach. The observation vector \({\varvec{l}}_{\textit{k}}\) consists of three parts for each epoch \(\textit{k}=1\ldots \textit{K}\). The first part \({\varvec{l}}^{\text {loc}}_{{\text{Scan}},k}\) consists of the measured LSC in the local sensor coordinate system with a total of N 3D points. \({\varvec{l}}^{\text {loc}}_{{\text{Scan}},k}\) only contains the 3D points which were assigned to a plane of the city modelFootnote 6 (for the assignment algorithm, see Sect. 2.3). These points are sorted on the basis of the planes they are assigned to:

$$\begin{aligned} {\varvec{l}}^{\text {loc}}_{{\text{Scan}},k} = \left[ {\varvec{l}}^{\text {loc}}_{{\text{Scan}},1,k};\ldots ;\;{\varvec{l}}^{\text {loc}}_{{\text{Scan}},e,k};\ldots ;\;{\varvec{l}}^{\text {loc}}_{{\text{Scan}},E,k}\right] \end{aligned}$$
(11)

with

$$\begin{aligned} {\varvec{l}}^{\text {loc}}_{{\text{Scan}},e,k} = \left[ {\varvec{p}}^{\text {loc}}_{e,1,k};\ldots ;\;{\varvec{p}}^{\text {loc}}_{e,i,k}; \ldots ;\;{\varvec{p}}^{\text {loc}}_{e,N_e,k}\right] \end{aligned}$$
(12)

and

$$\begin{aligned} {\varvec{p}}^{\text {loc}}_{e,i,k} = \left[ X^{\text {loc}}_{e,i,k},Y^{\text {loc}}_{e,i,k},Z^{\text {loc}}_{e,i,k}\right] ^T. \end{aligned}$$
(13)

\({\varvec{l}}^{\text {loc}}_{{\text{Scan}},e,k}\) represents the vector of all \(N_e\) points assigned to plane e. The total number N of 3D points of the LSC stored in the observation vector is calculated as:

$$\begin{aligned} N=\sum _{e=1}^{E} N_{e}. \end{aligned}$$
(14)

The second part \({\varvec{l}}^{\text {glo}}_{{\text{Pose}},k}\) consists of the direct GNSS and IMU observations for the position and orientation:

$${\varvec{l}}^{\text {glo}}_{{\text{Pose}},k} = \left[ \underbrace{t^{\text {GNSS}}_\textit{x,k},t^{\text {GNSS}}_\textit{y,k}, t^{\text {GNSS}}_\textit{z,k}}_\mathbf{t ^{\text {GNSS}}_\textit{k}}, \underbrace{\omega ^{\text {IMU}}_{\textit{k}}, \phi ^{\text {IMU}}_{\textit{k}}, \kappa ^{\text {IMU}}_{\textit{k}}}_\mathbf{o ^{\text {IMU}}_\textit{k}}\right] ^{T}.$$
(15)

Finally, the third part \({\varvec{l}}^{\text {glo}}_{\textit{V},0}\) consists of the M initial vertices of the E model planesFootnote 7:

$$\begin{aligned} {\varvec{l}}^{\text {glo}}_{\textit{V},0} = \left[ {{\varvec{V}}}^{\text {glo}}_{1,0};\ldots ;\;{{\varvec{V}}}^{\text {glo}}_{m,0} ;\ldots ;\;{{\varvec{V}}}^{\text {glo}}_{\textit{M,0}}\right] \end{aligned}$$
(16)

with

$$\begin{aligned} {{\varvec{V}}}^{\text {glo}}_{m,0}=\left[ \textit{V}^{\text {glo}}_{\textit{x,m,0}},\textit{V}^{\text {glo}}_{\textit{y,m,0}},\textit{V}^{\text {glo}}_{\textit{z,m,0}}\right] ^{T} \end{aligned}$$
(17)

representing a 3D point which is the vertex of at least one plane.

The observation vector is arranged as follows:

$$\begin{aligned} {\varvec{l}}_{\textit{k}} = \left[ {{\varvec{l}}^{\text {loc}}_{{\text{Scan}},k}};\;{{\varvec{l}}^{\text {glo}}_{{\text{Pose}},k}};\;{{\varvec{l}}^{\text {glo}}_{\textit{V},0}}\right] ^{}. \end{aligned}$$
(18)

We apply Eq. (19) to build the stochastic model of the observation vector. For a clearer discussion of the main issues, we neglect correlations in all cases:

$$\begin{aligned} {\varvec{\Sigma }}_{\textit{ll,k}}& = \text {diag}\left( \underbrace{{\sigma }_{\text{{LS}}}^{2}, \ldots ,{\sigma }_{\text{{LS}}}^{2}}_{3\cdot \textit{N}},\underbrace{{\varvec{\sigma }}_{\textit{t}}^{2}, {\varvec{\sigma }}_{\textit{o}}^{2}}_{6},\underbrace{ {\sigma }_{\textit{V}}^{2}, \ldots ,{\sigma }_{\textit{V}}^{2}}_{3 \cdot \textit{M}}\right) . \end{aligned}$$
(19)

The standard deviation of a laser scanner coordinate component is denoted by \(\sigma _{{\text{LS}}}\). \({\varvec{\sigma }}_{\textit{t}}=\left[ {\sigma }_{\textit{t}},{\sigma }_{\textit{t}},{\sigma }_{\textit{t}}\right]\) and \({\varvec{\sigma }}_{o} = \left[ {\sigma _{o} ,\sigma _{o} ,\sigma _{o} } \right]\) denote the standard deviation of the GNSS and IMU observations, and \(\sigma _V\) denotes the standard deviation of the initial vertices.
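A sketch of building the diagonal VCM of Eq. (19); the function and parameter names are our own, and the example values are placeholders:

```python
import numpy as np

def build_sigma_ll(N, M, sigma_ls, sigma_t, sigma_o, sigma_v):
    """Diagonal VCM of the observation vector (Eq. 19); correlations are
    neglected as in the text. Parameter names are our own."""
    variances = np.concatenate([
        np.full(3 * N, sigma_ls ** 2),   # N laser scanner points
        np.full(3, sigma_t ** 2),        # GNSS position
        np.full(3, sigma_o ** 2),        # IMU orientation
        np.full(3 * M, sigma_v ** 2),    # M initial vertices
    ])
    return np.diag(variances)

# Placeholder uncertainties for 48 assigned points and 6 vertices
S = build_sigma_ll(N=48, M=6, sigma_ls=0.03, sigma_t=0.05,
                   sigma_o=0.001, sigma_v=0.02)
print(S.shape)  # (168, 168)
```

The matrix dimension \(3N + 6 + 3M\) changes from epoch to epoch with the number of assigned points N.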

2.4.3 System Equation

When formulating the system equation of the form:

$$\begin{aligned} {\varvec{x}}_{k} = {\varvec{f}}\left( {\varvec{x}}_{k-1},{\varvec{u}}_{k-1}\right) +{\varvec{w}}_{k-1}, \end{aligned}$$
(20)

we neglected the control vector \({\varvec{u}}\) by setting it to zero. The system noise is normally distributed with \({\varvec{w}}_{k-1}\sim N({\varvec{0}},{\varvec{\Sigma }}_{ww})\). Equation (20) can be formulated as:

$$\begin{aligned} {\varvec{x}}_{k} = {\varvec{F}}_{x,k}\cdot {\varvec{x}}_{k-1} +{\varvec{w}}_{k-1}, \end{aligned}$$
(21)

where \({\varvec{F}}_{x,k}\) denotes the transition matrix:

$$\begin{aligned} {\varvec{F}}_{x,k}=\partial {\varvec{f}}/\partial {\varvec{x}}|_{{\hat{{\varvec{x}}}}_{k-1},{\varvec{u}}_{k-1}}. \end{aligned}$$
(22)

Since we do not intend to develop an optimal system model within this paper, we applied rather simple linear system equations. Nevertheless, a subsequent adaptation of the system model with more complex system equations is easily possible if the future flight characteristics of the UAS require it. For the system equations chosen, the transition matrix \({\varvec{F}}_{x,k}\) is given by:

$$\begin{aligned} {\varvec{F}}_{x,k}&=\begin{bmatrix} {{\varvec{F}}_{x,{\text{State}},k}}_{[9\times 9]}&\quad{\varvec{0}}_{[9\times 4\cdot E]}&\quad{\varvec{0}}_{[9\times 3\cdot M]}\\ {\varvec{0}}_{[4\cdot E\times 9]}&\quad{\varvec{I}}_{[4\cdot E\times 4\cdot E]}&\quad{\varvec{0}}_{[4\cdot E\times 3\cdot M]}\\ {\varvec{0}}_{[3\cdot M\times 9]}&\quad{\varvec{0}}_{[3\cdot M\times 4\cdot E]}&\quad{\varvec{I}}_{[3\cdot M\times 3\cdot M]} \end{bmatrix}, \end{aligned}$$
(23)

with

$${\varvec{F}}_{{x,{\text{State}},k}} = \left[ {\begin{array}{*{20}c} {{\varvec{I}}_{{[3 \times 3]}} } &\quad {{\varvec{0}}_{{[3 \times 3]}} } &\quad {{\text{diag}}(\left[ {\Delta \tau ,\Delta \tau ,\Delta \tau } \right])_{{[3 \times 3]}} } \\ {{\varvec{0}}_{{[6 \times 3]}} } &\quad \multicolumn{2}{c}{{{\varvec{I}}_{{[6 \times 6]}} }} \\ \end{array} } \right].$$
(24)

\(\Delta \tau\) is the time period between two consecutive epochs.

We apply the following equations for building the variance–covariance matrix (VCM) of the system noise \({\varvec{\Sigma }}_{ww}\):

$$\begin{aligned} {\varvec{\Sigma }}_{ww,{\text{State}}}&= \text {diag}\left( {\varvec{\sigma }}_{t,w}^{2},{\varvec{\sigma }}_{o,w}^{2},{\varvec{\sigma }}_{v,w}^{2}\right) \end{aligned}$$
(25)
$$\begin{aligned} {\varvec{\Sigma }}_{ww,{\text{Plane}}}&=\text {diag}\left( {\varvec{\sigma }}_{n,w}^{2},\sigma _{d,w}^{2},\ldots ,{\varvec{\sigma }}_{n,w}^{2},\sigma _{d,w}^{2}\right) \end{aligned}$$
(26)
$$\begin{aligned} {\varvec{\Sigma }}_{ww,V}&=\text {diag}\left( \sigma _{V,w}^{2},\ldots ,\sigma _{V,w}^{2}\right) \end{aligned}$$
(27)
$$\begin{aligned} {\varvec{\Sigma }}_{ww}&= \begin{bmatrix} {\varvec{\Sigma }}_{ww,{\text{State}}}&\quad{\varvec{0}}&\quad{\varvec{0}} \\ {{\varvec{0}}}&\quad{\varvec{\Sigma }}_{ww,{\text{Plane}}}&\quad{\varvec{0}}\\ {\varvec{0}}&\quad{\varvec{0}}&\quad{\varvec{\Sigma }}_{ww,V} \end{bmatrix}. \end{aligned}$$
(28)

The standard deviations of the system noise for translation, orientation, and velocity are denoted by \({\varvec{\sigma }}_{t,w}=\left[ \sigma _{t,w},\sigma _{t,w},\sigma _{t,w}\right]\), \({\varvec{\sigma }}_{o,w}=\left[ \sigma _{o,w},\sigma _{o,w},\sigma _{o,w}\right]\) and \({\varvec{\sigma }}_{v,w}=\left[ \sigma _{v,w},\sigma _{v,w},\sigma _{v,w}\right]\). The system noise for the plane parameters and vertices is given by the standard deviations \({\varvec{\sigma }}_{n,w}=\left[ \sigma _{n,w},\sigma _{n,w},\sigma _{n,w}\right]\), \(\sigma _{d,w}\) and \(\sigma _{V,w}\).
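The transition matrix of Eqs. (23)–(24) can be sketched as follows, assuming the state layout of Eq. (10); the function name and values are our own:

```python
import numpy as np

def transition_matrix(E, M, dtau):
    """F_x of Eqs. (23)-(24): positions integrate velocities over dtau;
    orientation, velocity, planes, and vertices are kept constant."""
    n = 9 + 4 * E + 3 * M
    F = np.eye(n)
    F[0:3, 6:9] = dtau * np.eye(3)   # t_k = t_{k-1} + dtau * v_{k-1}
    return F

F = transition_matrix(E=2, M=6, dtau=0.05)
x = np.zeros(35)
x[6:9] = 1.0                         # velocity of 1 m/s per axis
print((F @ x)[0:3])                  # [0.05 0.05 0.05]
```

All blocks besides the position-velocity coupling are identity blocks, reflecting that plane parameters and vertices are only driven by the system noise of Eqs. (25)–(28).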

2.4.4 Measurement Equation

With the state parameter and observation vector given in Sects. 2.4.1 and 2.4.2, we can formulate three types of nonlinear implicit measurement equations. The first type of measurement equation \({\varvec{h}}_{\mathrm {I}}\) enforces that each transformed LSC \({\varvec{l}}^{\text {glo}}_{{\text{Scan}},e,k}\) lies in the assigned plane of the city model, represented by \({\varvec{n}}_{e}\) and \(d_{e}\). If the point \({\varvec{p}}^{\text {loc}}_{e,i,k}=\left[ X^{\text {loc}}_{e,i,k},Y^{\text {loc}}_{e,i,k},Z^{\text {loc}}_{e,i,k}\right] ^T\) is contained in \({\varvec{l}}^{\text {loc}}_{{\text{Scan}},e,k}\), which consists of all points assigned to plane e of the 3D city model, the following measurement equation is applied:

$$\begin{aligned} h_{\mathrm {I},i}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right)&:{\varvec{n}}_{e}^{T}\cdot \underbrace{\left[ {\varvec{t}}_{k}+{\varvec{R}}_{k}\cdot E({\varvec{p}}^{\text {loc}}_{e,i,k})\right] }_{{\varvec{p}}^{\text {glo}}_{e,i,k}}-d_{e}=0, \end{aligned}$$
(29)

with \(i\in \{1\ldots N\}\).

The rotation matrix \({\varvec{R}}_k\) is calculated based on the current orientation \({\varvec{o}}_k\) according to Eqs. (2)–(5). Subsequently, we obtain N measurement equations of type \({\varvec{h}}_{\mathrm {I}}\):

$$\begin{aligned} {\varvec{h}}_{\mathrm {I}}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right)&=\left[ h_{\mathrm {I},1}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right) , \ldots ,h_{\mathrm {I},N}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right) \right] ^{T}. \end{aligned}$$
(30)
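The point-to-plane condition of Eq. (29) can be sketched as follows (assuming \({\varvec{R}}_k\) has already been built from \({\varvec{o}}_k\); function and variable names are illustrative):

```python
import numpy as np

def h_I(n_e, d_e, t_k, R_k, p_loc):
    """Residual of Eq. (29): signed distance of the transformed laser
    scanner point to the assigned plane n_e^T x = d_e (zero when the
    point lies exactly in the plane)."""
    p_glo = t_k + R_k @ p_loc      # transform into the global system
    return n_e @ p_glo - d_e

# Example with the vertical plane x = 2 (n = [1,0,0], d = 2):
n = np.array([1.0, 0.0, 0.0])
res = h_I(n, 2.0, t_k=np.array([1.0, 0.0, 0.0]),
          R_k=np.eye(3), p_loc=np.array([1.0, 3.0, 0.5]))
# res == 0: the transformed point [2, 3, 0.5] lies in the plane
```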

Figure 5 shows an illustration of the relationship between state parameters, observations, and measurement equations.

Fig. 5

Example of the relationship between state parameters, observations, and measurement equations in an arbitrary epoch k: The \(N_1=26\) blue points \({\varvec{p}}_{1,1,k}^{\text{glo}}\) to \({\varvec{p}}_{1,26,k}^{\text{glo}}\) are assigned to plane 1; accordingly, \({\varvec{l}}_{{\text{Scan}},1,k}^{\mathrm {loc}}\) consists of these blue points in the local sensor coordinate system. The \(N_2=22\) purple points \({\varvec{p}}_{2,1,k}^{\text{glo}}\) to \({\varvec{p}}_{2,22,k}^{\text{glo}}\) are assigned to plane 2; accordingly, \({\varvec{l}}_{{\text{Scan}},2,k}^{\mathrm {loc}}\) consists of these purple points in the local sensor coordinate system. If we assume that there are only those \(E=2\) planes in the 3D city model, the observation vector \({\varvec{l}}_{{\text{Scan}},k}^{\mathrm {loc}}\) has the dimension \([144\times 1]\). According to Eq. (14), N is equal to 48. The state parameter vector \({\varvec{x}}_{V,k}\) consists of the six vertices \({\varvec{V}}_{1,k}^{\text{glo}}\) to \({\varvec{V}}_{6,k}^{\text{glo}}\) (in the global coordinate system) and has the dimension \([18\times 1]\). The vector \({\varvec{l}}_{V,0}^{\text{glo}}\) has the same dimension. With solely two planes in the 3D city model, \({\varvec{x}}_{{\text{Plane}},k}\) has the dimension \([8\times 1]\)

In the second type of measurement equation \({\varvec{h}}_{\mathrm {II}}\), we calculate the difference between the estimated pose \(\left[ {\varvec{t}}_{k};{\varvec{o}}_{k}\right]\) and observed pose \({\varvec{l}}^{\text {glo}}_{{\text{Pose}},k}\) of GNSS and IMU:

$${\varvec{h}}_{\text {II}}\left( E({\varvec{l}}_{k}),\,{\varvec{x}}_{k}\right):\left[ {\varvec{t}}_{k};\,{\varvec{o}}_{k}\right] -E({\varvec{l}}^{\text {glo}}_{{\text{Pose}},k})={\varvec{0}}.$$
(31)

If the GNSS and IMU data are available, we obtain six measurement equations of type \({\varvec{h}}_{\mathrm {II}}\). When the GNSS signal is missing, the measurement equation changes to:

$$\begin{aligned} {\varvec{h}}_{\mathrm {II}}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right)&: {\varvec{o}}_{k}-{E({\varvec{o}}^{\text {IMU}}_{k})}={\varvec{0}}, \end{aligned}$$
(32)

and we only obtain three measurement equations of this type.

The third type of measurement equation \({\varvec{h}}_{\mathrm {III}}\) is introduced to avoid a datum defect. It computes the difference between the estimated vertices in \({\varvec{x}}_{V,k}\) and the initial vertices in \({\varvec{l}}^{\text {glo}}_{V,0}\). If the vertex \({\varvec{V}}^{\text {glo}}_{m,k}\) is stored in \({\varvec{x}}_{V,k}\) and \({\varvec{V}}^{\text {glo}}_{m,0}\) is its initial observation stored in \({\varvec{l}}^{\text {glo}}_{V,0}\), the following measurement equations are applied:

$$\begin{aligned} {\varvec{h}}_{\mathrm {III},m}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right)&: {\varvec{V}}^{\text {glo}}_{m,k}-E({\varvec{V}}^{\text {glo}}_{m,0})={\varvec{0}}, \end{aligned}$$
(33)

with \(m\in \{1\ldots M\}\).

For a three-dimensional vertex, Eq. (33) consists of three measurement equations. Subsequently, we obtain \(3\cdot M\) measurement equations of type \({\varvec{h}}_{\mathrm {III}}\):

$$\begin{aligned} {\varvec{h}}_{\mathrm {III}}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right)&= \begin{bmatrix} {\varvec{h}}_{\mathrm {III},1}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right) ^{}\\ \vdots \\ {\varvec{h}}_{\mathrm {III},M}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right) ^{} \end{bmatrix}. \end{aligned}$$
(34)

Altogether, the measurement equations are given by:

$$\begin{aligned} {\varvec{h}}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right)&=\begin{bmatrix} {\varvec{h}}_{\mathrm {I}}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right) \\ {\varvec{h}}_{\mathrm {II}}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right) \\ {\varvec{h}}_{\mathrm {III}}\left( E({\varvec{l}}_{k}),{\varvec{x}}_{k}\right) \end{bmatrix}. \end{aligned}$$
(35)

2.4.5 Nonlinear Equality Constraint for the State Parameters

We apply two types of nonlinear equality constraints for the state parameters in our IEKF. The prior information we want to model consists of hard constraints that must be fulfilled exactly.

The first type of nonlinear equality constraint \({\varvec{g}}_{\mathrm {I}}\) arises because we use plane parameters within the state parameter vector \({\varvec{x}}_{k}\). Here, we must ensure that the normal vectors are unit vectors, i.e., have a length of one. For this, we make use of the nonlinear equality constraints:

$$\begin{aligned} g_{\mathrm {I},e}\left( {\varvec{x}}_{k}\right)&=||{\varvec{n}}_{e}||=\sqrt{n_{e_x}^2+n_{e_y}^2+n_{e_z}^2}= b_{\mathrm {I},e}=1, \end{aligned}$$
(36)

with \(e\in \{1\ldots E\}\).

The nonlinear equality constraint of type \({\varvec{g}}_{\mathrm {I}}\) applies to each of the E planes' normal vectors:

$$\begin{aligned} {\varvec{g}}_{\mathrm {I}}\left( {\varvec{x}}_{k}\right)&= \left[ g_{\mathrm {I},1}\left( {\varvec{x}}_{k}\right) ,\ldots ,g_{\mathrm {I},E}\left( {\varvec{x}}_{k}\right) \right] ^{T}. \end{aligned}$$
(37)

The right-hand sides of Eq. (36) are stored in the vector \({\varvec{b}}_{\mathrm {I}}\):

$$\begin{aligned} {\varvec{b}}_{\mathrm {I}}&=\left[ b_{\mathrm {I},1},\ldots , b_{\mathrm {I},E}\right] ^{T}. \end{aligned}$$
(38)

With the second type of nonlinear equality constraint \({\varvec{g}}_{\mathrm {II}}\), we ensure that each vertex of a plane is located in that plane, thereby preserving the topology of the 3D city model. If we assume that \({\varvec{V}}^{\text {glo}}_{m,k}\) is a vertex of plane e, represented by \({\varvec{n}}_{e}\) and \(d_{e}\), the following equality constraint is applied:

$$\begin{aligned} g_{\mathrm {II},i}\left( {\varvec{x}}_{k}\right)&={\varvec{n}}_{e}^{T}\cdot {\varvec{V}}^{\text {glo}}_{m,k}-d_{e}=b_{\mathrm {II},i}=0, \end{aligned}$$
(39)

with \(i\in \{1\ldots 4\cdot E\}\).

This type of nonlinear equality constraint must be fulfilled for each vertex of all planes. In general, planes in a 3D city model have 4 vertices, but planes with 3 or more than 4 vertices are also possible. Thus, the number \(4\cdot E\) is just a rough estimate of the total number of equality constraints of type \({\varvec{g}}_{\mathrm {II}}\):

$$\begin{aligned} {\varvec{g}}_{\mathrm {II}}\left( {\varvec{x}}_{k}\right)&=\left[ g_{\mathrm {II},1}\left( {\varvec{x}}_{k}\right) ,\ldots ,g_{\mathrm {II},4\cdot E}\left( {\varvec{x}}_{k}\right) \right] ^{T}. \end{aligned}$$
(40)

The right-hand sides of Eq. (39) are stored in the vector \({\varvec{b}}_{\mathrm {II}}\):

$$\begin{aligned} {\varvec{b}}_{\mathrm {II}}&=\left[ b_{\mathrm {II},1},\ldots ,b_{\mathrm {II},4\cdot E}\right] ^{T}. \end{aligned}$$
(41)
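Both constraint types (Eqs. 36 and 39) can be illustrated with a small sketch (names are illustrative):

```python
import numpy as np

def g_I(n_e):
    """Unit-norm constraint (Eq. 36): must equal b_I = 1."""
    return np.linalg.norm(n_e)

def g_II(n_e, d_e, V_m):
    """Vertex-in-plane constraint (Eq. 39): must equal b_II = 0."""
    return n_e @ V_m - d_e

# A horizontal plane z = 4 with a vertex that lies in it:
n = np.array([0.0, 0.0, 1.0])
V = np.array([1.0, 2.0, 4.0])
ok = (abs(g_I(n) - 1.0) < 1e-12) and (abs(g_II(n, 4.0, V)) < 1e-12)
```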

A comparable procedure can be found in Unger et al. (2016). Altogether, the nonlinear equality constraints for the state parameters are given by:

$$\begin{aligned} {\varvec{g}}\left( {\varvec{x}}_{k}\right)&=\left[ {\varvec{g}}_{\mathrm {I}}\left( {\varvec{x}}_{k}\right);\,{\varvec{g}}_{\mathrm {II}}\left( {\varvec{x}}_{k}\right) \right] ^{} \end{aligned}$$
(42)
$$\begin{aligned} {\varvec{b}}&=\left[ {\varvec{b}}_{\mathrm {I}};\,{\varvec{b}}_{\mathrm {II}}\right] ^{}. \end{aligned}$$
(43)

2.4.6 Initialization

The initial state parameter vector \({\varvec{x}}_{0}\) is created from the GNSS and IMU observations. We assume zero velocity in each coordinate component:

$$\begin{aligned} {\varvec{v}}_{0}&={\varvec{0}}_{[3\times 1]}. \end{aligned}$$
(44)

The initial plane parameters \({\varvec{n}}_{e,0}\) and \(d_{e,0}\) are estimated from the planes’ vertices stored in the city model using the Drixler algorithm (Drixler 1993). The vertices given in the 3D city model serve as the initial vertices \({\varvec{x}}_{V,0}\):

$$\begin{aligned} {\varvec{x}}_{{\text{State}},0}&=\left[ {\varvec{t}}_{{\text{GNSS}},0};\,{\varvec{o}}_{{\text{IMU}},0};\,{\varvec{v}}_{0}\right] \end{aligned}$$
(45)
$$\begin{aligned} {\varvec{x}}_{{\text{Plane}},0}&=\left[ {\varvec{n}}_{1,0};\,d_{1,0};\ldots ;\,{\varvec{n}}_{E,0};\,d_{E,0}\right] \end{aligned}$$
(46)
$$\begin{aligned} {\varvec{x}}_{V,0}&=\left[ {\varvec{V}}^{\text {glo}}_{1,0};\ldots ;\,{\varvec{V}}^{\text {glo}}_{M,0}\right] \end{aligned}$$
(47)
$$\begin{aligned} {\varvec{x}}_{0}&=\left[ {\varvec{x}}_{{\text{State}},0};\,{\varvec{x}}_{{\text{Plane}},0};\,{\varvec{x}}_{V,0}\right] ^{}. \end{aligned}$$
(48)
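The initial plane parameters could, for instance, be obtained from each plane's vertices by a standard SVD-based best-fit plane. Note that this is a stand-in sketch, not the Drixler (1993) algorithm used in the paper:

```python
import numpy as np

def fit_plane(vertices):
    """Best-fit plane n^T x = d through a set of vertices, via SVD on
    the centered coordinates (a stand-in for the Drixler estimate)."""
    V = np.asarray(vertices, dtype=float)
    c = V.mean(axis=0)                  # the centroid lies in the plane
    # normal = right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(V - c)
    n = Vt[-1]
    n /= np.linalg.norm(n)              # enforce the unit norm of Eq. (36)
    return n, float(n @ c)              # d = n^T c

# Four vertices of the plane z = 3:
n, d = fit_plane([[0, 0, 3], [1, 0, 3], [1, 1, 3], [0, 1, 3]])
# n is +/-[0, 0, 1] and d = +/-3 (the sign of the normal is arbitrary)
```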

We apply the following equations for building the VCM of the initial state parameter vector \({\varvec{\Sigma }}_{xx,0}\):

$$\begin{aligned} {\varvec{\Sigma }}_{xx,{\text{State}},0}&=\text {diag}\left( {\varvec{\sigma }}_{t,0}^{2},{\varvec{\sigma }}_{o,0}^{2},{\varvec{\sigma }}_{v,0}^{2}\right) \end{aligned}$$
(49)
$$\begin{aligned} {\varvec{\Sigma }}_{xx,{\text{Plane}},0}&=\text {diag}\left( {\varvec{\sigma }}_{n,0}^{2},\sigma _{d,0}^{2},\ldots ,{\varvec{\sigma }}_{n,0}^{2},\sigma _{d,0}^{2}\right) \end{aligned}$$
(50)
$$\begin{aligned} {\varvec{\Sigma }}_{xx,V,0}&=\text {diag}\left( \sigma _{V,0}^{2},\ldots ,\sigma _{V,0}^{2}\right) \end{aligned}$$
(51)
$$\begin{aligned} {\varvec{\Sigma }}_{xx,0}&= \begin{bmatrix} {\varvec{\Sigma }}_{xx,{\text{State}},0}&{\varvec{0}}&{\varvec{0}}\\ {\varvec{0}}&{\varvec{\Sigma }}_{xx,{\text{Plane}},0}&{\varvec{0}} \\ {\varvec{0}}&{\varvec{0}}&{\varvec{\Sigma }}_{xx,V,0} \end{bmatrix}. \end{aligned}$$
(52)

The standard deviations of the initial state parameters of translation, orientation, and velocity are denoted by \({\varvec{\sigma }}_{t,0}=\left[ \sigma _{t,0},\sigma _{t,0},\sigma _{t,0}\right]\), \({\varvec{\sigma }}_{o,0}=\left[ \sigma _{o,0},\sigma _{o,0},\sigma _{o,0}\right]\), and \({\varvec{\sigma }}_{v,0}=\left[ \sigma _{v,0},\sigma _{v,0},\sigma _{v,0}\right]\). The standard deviations of the initial plane parameters and vertices are denoted by \({\varvec{\sigma }}_{n,0}=\left[ \sigma _{n,0},\sigma _{n,0},\sigma _{n,0}\right]\), \(\sigma _{d,0}\) and \(\sigma _{V,0}\).

2.5 Workflow

The workflow of our algorithm is summarized in Algorithm 1. It is an adaptation of the algorithm proposed in Vogel et al. (2018). We highlighted the cross references to equations and sections in blue. The partial derivatives (see lines 9, 22, 23, and 40) were obtained once using the Symbolic Math Toolbox of \(\text {Matlab}^{\copyright }\) and subsequently implemented in a function. Another possibility to obtain the partial derivatives would be the use of INTLAB (Rump 1999). We fixed the stop criterion \(c_{\mathrm {stop}}\) to \(1\times 10^{-12}\) in the measurement update step. This stop criterion was reached after an average of six iterations in the subsequent simulation.


3 Simulation

Computer simulations are a valuable tool for analyzing and interpreting engineering systems. Here, we intended to evaluate the functionality and performance of the implemented IEKF. We therefore focused on two scenarios which basically conform to the testing environment designated for the UAS Project. Furthermore, we focused on the challenging initialization phase of the IEKF and analyzed a rather short trajectory of 2.5 m length comprising 50 epochs.

To this end, we created a model of a fictitious building and a ground plane, and determined a fictitious trajectory beside this building. We assumed a constant UAS velocity of 1 m s\(^{-1}\) and simulated laser scanner measurements with a frequency of 20 Hz. Consequently, the simulated epochs are at constant distances of 0.05 m. Each epoch consists of a laser scan and the desired values of position and orientation obtained from the determined trajectory at a certain time.

Figures 6, 7 and 8 depict different views of the created building and ground plane. The building contains roof planes (dark grey), wall planes (red), and window planes (transparent blue). Figure 9 depicts the corresponding 3D city model of the building. As can be seen, the ground plane is not part of the 3D city model. The 3D city model contains the vertices of each plane and is given in the superordinate coordinate system. The simulated and correctly georeferenced point cloud (blue dots) of the first epoch and the trajectory of the UAS (magenta line) are depicted in each figure.

Fig. 6

Top view of the simulation environment with a simulated point cloud (blue dots) of the first epoch and trajectory of the UAS (magenta line)

Fig. 7

Side view of the simulation environment with a simulated point cloud (blue dots) of the first epoch and trajectory of the UAS (magenta line)

Fig. 8

Oblique view of the simulation environment with a simulated point cloud (blue dots) of the first epoch and trajectory of the UAS (magenta line)

Fig. 9

3D city model with a simulated point cloud (blue dots) of the first epoch and trajectory of the UAS (magenta line)

3.1 Scenarios

We simulated two scenarios in which the laser scanner and IMU measurements were generated differently under certain assumptions regarding measurement accuracy and bias. We repeated the simulation 500 times (\(S=500\)) for both scenarios. In scenario 1, we only added normally distributed noise to the laser scanner measurements, the position (representing the GNSS receiver), and the orientation (representing the IMU). In scenario 2, we systematically perturbed the laser scanner measurements hitting the windows of the simulated building and increased the (standard deviation of the) noise for laser scanner measurements hitting the windows and the ground. The systematic and increased random disturbances of the measurements hitting the windows simulate the infiltration behavior of glass; the increased noise of these measurements covers both the increased measurement uncertainty and the actual infiltration of the laser. The increased noise for measurements hitting the ground simulates possible unevenness and vegetation. Thus, this increased noise is justified more by actual structures in the object space than by an increased uncertainty of the measurement. Furthermore, we added a drift to the \(\kappa\)-component of the IMU by linking the systematic effect \(\Delta\) to the epoch number \(k\in \{1,\ldots ,K\}\). The \(\kappa\)-component conforms to the heading of the UAS. All assumed systematic effects \(\Delta\) and standard deviations \(\sigma\) of the added noise are listed in Table 1. Basically, the assumptions of scenario 1 are consistent with the manufacturers' specifications of the sensors used or planned for our UAS. The assumptions in scenario 2, especially the ones for the laser scanner, are based on experience from test measurements. In our simulation, we assume that the positions and orientations of all sensors in a platform coordinate system or body frame have already been determined in a calibration process.
We also assume that the sensors are properly synchronized. To neglect the effect of generalization in the 3D city model, we used the same model for simulation and in the following IEKF algorithm.

Table 1 Assumed systematic effects \(\Delta\) and standard deviations \(\sigma\) of the added noise for all sensors and observations (Obs.)

3.2 Values Chosen for the IEKF

We used a constant distance threshold \(d_{{{\text{assign}}}} = 0.3\,{\text{m}}\) for the assignment algorithm (see Sect. 2.3) in our simulation. This chosen threshold is a trade-off between the rather imprecise pose in the first epochs and the more precise pose in the last epochs.
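A simplified sketch of this assignment step: each point is assigned to the nearest plane of the city model if its point-to-plane distance stays below \(d_{{\text{assign}}}\). This treats the planes as infinite and omits the test against the plane's polygon extent, so it only approximates the algorithm of Sect. 2.3; names are illustrative:

```python
import numpy as np

D_ASSIGN = 0.3  # distance threshold in meters (Sect. 2.3)

def assign_points(points_glo, normals, ds, d_assign=D_ASSIGN):
    """Return, per point, the index of the nearest plane within the
    threshold, or -1 if no plane is close enough (simplified sketch:
    infinite planes, no polygon-extent test)."""
    idx = []
    for p in points_glo:
        dist = np.abs(normals @ p - ds)   # |n_e^T p - d_e| per plane
        e = int(np.argmin(dist))
        idx.append(e if dist[e] <= d_assign else -1)
    return idx

# Two planes: x = 0 and z = 5
N = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
d = np.array([0.0, 5.0])
labels = assign_points(np.array([[0.1, 2.0, 1.0],    # near plane 0
                                 [3.0, 1.0, 4.9],    # near plane 1
                                 [2.0, 2.0, 2.0]]),  # near no plane
                       N, d)
# labels == [0, 1, -1]
```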

The standard deviations used for building the initial VCM \({\varvec{\Sigma }}_{xx,0}\) (see Eqs. 49–52) are listed at the top of Table 2. The standard deviations of the initial position \(\sigma _{t,0}\) and orientation \(\sigma _{o,0}\) conform to the standard deviations of the GNSS and IMU observations. We assume a standard deviation for the initial velocity \(\sigma _{v,0}\) which covers the planned maximal velocities of the UAV. For the standard deviations of the initial plane parameters \(\sigma _{n,0}\) and \(\sigma _{d,0}\), we assumed that these values should be very small. Thus, they correspond roughly to the standard deviation of the initial vertices \(\sigma _{V,0}\), which, in turn, conforms to the standard deviation chosen for the observation vector. If the accuracies of the planes’ vertices are known, the standard deviations of the plane parameters could also be determined by means of variance propagation.

The standard deviations chosen for the system noise (see Eqs. 25–28) can be found in the middle of Table 2. We set the standard deviations for translation \(\sigma _{t,w}\), orientation \(\sigma _{o,w}\), and velocity \(\sigma _{v,w}\) depending on the time period \(\Delta \tau\) between two epochs, accounting for possible unpredictable movements of the UAS in this time period. We set the system noise to zero for \({\varvec{\Sigma }}_{ww,{\text{Plane}}}\) and \({\varvec{\Sigma }}_{ww,V}\), because these parameters are constant over time. Consequently, \(\sigma _{n,w}\), \(\sigma _{d,w}\), and \(\sigma _{V,w}\) are zero.

We used the simulated standard deviation \(\sigma _{\text{LS}}=0.02\,\hbox {m}\) of the laser scanner measurements for the stochastic model of the observation vector (see Eq. 19). We also introduced the simulated standard deviations \(\sigma _{t}=0.5\,{\text{m}}\) and \(\sigma _{o}=0.2^{\circ}\) for the GNSS and IMU observations. In the case of the initial vertices of the 3D building model, we introduced a small standard deviation of \(\sigma _{V}=0.1\,{\text{mm}}\). The 3D city model is, thus, almost introduced as a fixed feature. We are aware that this assumption is too optimistic and that the accuracy of the introduced 3D city model is rather in the range of a few centimeters. However, in the context of this paper, the influence of a less precise 3D city model will not be considered in more detail.

We assume that \({\varvec{\Sigma }}_{ll,k}\) and \({\varvec{\Sigma }}_{ww}\) contain no correlations, either within or between them.

Table 2 Assumed standard deviations \(\sigma\) for the initial VCM of the state parameters \({\varvec{\Sigma }}_{xx,0}\), the system noise \({\varvec{\Sigma }}_{ww}\), and the observation vector \({\varvec{\Sigma }}_{ll}\)

4 Results and Discussion

In the following, we depict the results of the IEKF for both scenarios and compare them with the results of a linear Kalman filter (LKF), which only uses the simulated GNSS and IMU observations. The comparison between LKF and IEKF evaluates the benefit of introducing laser scanner observations and a city model. The state parameter vector \({\varvec{x}}_{k}\) of the LKF solely consists of \({\varvec{x}}_{{\text{State}},k}\) (Eq. 6). The observation vector \({\varvec{l}}_k\) consists of \({\varvec{l}}^{\mathrm {glo}}_{{\text{Pose}},k}\) (Eq. 15). The transition matrix \({\varvec{F}_{x,k}}\) of the LKF is given by \({\varvec{F}}_{x,{\text{State}},k}\) (Eq. 24). Consequently, the VCM of the observation vector and of the system noise have to be adapted. The LKF's only remaining measurement equation (Eq. 31) has to be transformed into explicit form. For the LKF, no linearization is necessary and no constraints are used. For more information concerning the LKF, see Simon (2010). Figures 10, 11, 12, 13, 14 and 15 depict the median values (solid lines) and 68% (\(1\sigma\)) confidence intervals (CI) (dashed lines) for the translation parameters \(t_{x}\), \(t_{y}\), \(t_{z}\) (Figs. 10, 11, 12) and orientation parameters \(\omega\), \(\phi\), \(\kappa\) (Figs. 13, 14, 15) in scenario 1. The CI are calculated numerically using the total number of simulated runs S according to Alkhatib et al. (2009). The red lines result from the filtered state parameters \({\hat{{\varvec{x}}}}_{k}^{+}\) of the IEKF after the constraint step; the blue lines from the filtered state parameters \({\hat{{\varvec{x}}}}_{k}^{+}\) of the LKF. The true values are plotted as a dashed black line.

The median of the IEKF result for \(t_{x}\) (across flight direction) in Fig. 10 is very close to the true translation, whereas we can see slight random deviations between the median of the LKF results and the true values. The latter can be seen for each pose component and is not explicitly mentioned again in the following. The CI of the IEKF and the LKF shrink rapidly in the first 5 epochs. Whereas the CI of the IEKF is very close to the median, the CI of the LKF is significantly broader. The latter can be seen for each pose component and is also not explicitly mentioned again. The IEKF performs well for that pose component because of the measurement constellation. Each measurement distinctively assigned to a plane provides information for that pose component. In other words, each measured plane of the 3D city model is sensitive for that pose component.

Fig. 10

Median and confidence intervals of \(t_{x}\) in scenario 1

The results are similar for \(t_{y}\) (in flight direction) in Fig. 11. Again, the median of the IEKF results is very close to the true translation and its CI shrinks within the first epochs. For this pose component, the shrinking of the CI (IEKF) is slightly slower than for \(t_x\) and lasts until epoch 10. A reason for that slight difference can be found in the weaker measurement constellation for that pose component. Only the triangle-shaped protrusions are sensitive for \(t_{y}\). Consequently, there are significantly fewer observations which provide information for that pose component.

Fig. 11

Median and confidence intervals of \(t_{y}\) in scenario 1

Figure 12 shows the results for \(t_{z}\) (up-direction). The median of the IEKF results is nearly constant, but seems to deviate from the true values by a small systematic shift of about 3 cm. This systematic shift is caused by ground points wrongly assigned to a wall plane of the building, which results in a slight rotation around the y-axis (see results for \(\phi\)) in combination with a slight descent of \(t_z\). Again, the CI of the IEKF translation shrinks rapidly within the first 6 epochs. The boundaries of the CI in the first epochs are not symmetrical around the median; they are slightly shifted to larger \(t_z\) values. That means that the IEKF converges more slowly towards the true translation when the initial \(t_z\) values are larger than the true translation. An initial \(t_z\) value larger than the true translation may cause significantly fewer assignments in the first epochs. Only the roof planes are sensitive for that pose component.

Fig. 12

Median and confidence intervals of \(t_{z}\) in scenario 1

The results for \(\omega\), which represents the rotation around the x-axis, are shown in Fig. 13. After 10 epochs, the median of IEKF and LKF obtain comparable results, whereas the median of the IEKF seems to vary slightly more. The CI of the IEKF shrinks within the first 12 epochs. Similar to the \(t_z\) pose component, there is a shift of the IEKF CI towards larger \(\omega\) values. The only sensitive planes for that pose component are the triangle-shaped protrusions and the roof planes. This may explain the larger variations in the median of the IEKF orientation.

Fig. 13

Median and confidence intervals of \(\omega\) in scenario 1

Figures 14, 15 show the results for \(\phi\) (rotation around y-axis) and \(\kappa\) (rotation around z-axis), respectively. We can see similar characteristics for these pose components. In both cases, the median of the IEKF results is constant but systematically shifted by a small value of about \(0.05^{\circ}\). As has already been mentioned for \(t_{z}\), the systematic shift in \(\phi\) is caused by wrongly assigned ground points. Because of the skewed alignment of the laser scanner, the wrongly assigned ground points also cause a slight rotation in \(\kappa\). All planes are sensitive for both pose components.

Fig. 14

Median and confidence intervals of \(\phi\) in scenario 1

Fig. 15

Median and confidence intervals of \(\kappa\) in scenario 1

For scenario 2, we depict the results in a similar way in Figs. 16, 17, 18, 19, 20 and 21. In Fig. 16, we can see the results for \(t_{x}\). Similar to scenario 1, the median of the IEKF results comes very close to the true translation. The lower boundary of the CI (IEKF) is farther from the median than the upper boundary; it shrinks until epoch 14. An explanation for the one-sided enlarged CI (IEKF) lies in the systematically extended laser scanner measurements hitting the windows. With an initial \(t_x\) translation farther away from the building than the true translation, these systematically extended laser scanner measurements are wrongly assigned to the walls of the building. After some epochs, these false assignments seem to have disappeared. These effects are analyzed later on (see Fig. 29).

Fig. 16

Median and confidence intervals of \(t_{x}\) in scenario 2

The median of the IEKF results for \(t_y\) in Fig. 17 is close to the true translation. The upper boundary of the CI (IEKF) shrinks until epoch 13, whereas the lower boundary of the CI (IEKF) clearly departs from the median from epoch 14 on. Again, these effects are analyzed later on (see Figs. 28 and 29).

Fig. 17

Median and confidence intervals of \(t_{y}\) in scenario 2

We can see the results for \(t_z\) in scenario 2 in Fig. 18. The results are comparable to the ones obtained in scenario 1 for that pose component. A significant difference can be found in the upper boundary of the CI (IEKF), which converges to the median much more slowly than in scenario 1. In scenario 2, it does not converge until epoch 15. Again, initial \(t_z\) values larger than the true translation cause significantly fewer assignments in the first epochs.

Fig. 18

Median and confidence intervals of \(t_{z}\) in scenario 2

We observe results comparable to scenario 1 for the pose parameters \(\omega\) and \(\phi\) in Figs. 19 and 20. In both figures, we can observe that one boundary of the CI (IEKF) is farther from the median than the other and that the CI (IEKF) shrinks more slowly than in scenario 1.

Fig. 19

Median and confidence intervals of \(\omega\) in scenario 2

Fig. 20

Median and confidence intervals of \(\phi\) in scenario 2

The pose component \(\kappa\) is depicted in Fig. 21. The CI (IEKF) shrinks until epoch 12. In addition, we can see that the simulated drift in \(\kappa\) does not affect the IEKF results, whereas the median of the LKF departs from the true orientation continuously as simulated.

Fig. 21

Median and confidence intervals of \(\kappa\) in scenario 2

A suitable measure to evaluate the performance of the filters is the root-mean-square error (RMSE). In our simulation, the RMSE is the error between the filtered state parameters and the true state parameters. The RMSE of the pose parameter \(t_x\) for run \(s\in \{1,\ldots ,S\}\) is calculated according to:

$${\text{RMSE}}_{{t_x,R.,s}} = \frac{1}{K}\sum _{k=1}^{K}\sqrt{\left( t_{x,s,k}-{\bar{t}}_{x,k}\right) ^2}.$$
(53)

\(t_{x,s,k}\) results from the filtered state parameter vector \({\hat{{\varvec{x}}}}_{k}^{+}\) of the IEKF or the LKF, respectively. \({\bar{t}}_{x,k}\) is the true value. The calculation of the RMSE for the other pose parameters is carried out analogously.

Table 3 depicts the characteristic criteria: minimum, maximum, mean, median, standard deviation (SD), and the lower (\(\downarrow\)) and upper (\(\uparrow\)) bounds of the 68% and 95% CI of the different RMSE values. In the last row of Table 3, we display the rate of runs in which the IEKF obtained a smaller RMSE than the LKF and vice versa. Each criterion is divided into two rows for scenario 1 (above) and scenario 2 (below).

Table 3 RMSE of position and orientation in scenario 1 (first row) and 2 (second row)

Table 3 clearly shows that the IEKF obtains better results than the LKF. The median values especially are significantly smaller for each state parameter and each scenario. By contrast, the spread of the IEKF results is significantly larger than that of the LKF results, as can be seen in the SD and 95% CI values. A possible explanation lies in a larger number of runs in which the IEKF ends up far away from the true values; we will subsequently call these runs failures. Such failures can be caused by disadvantageous initial pose parameters in the first epoch: a large number of laser scanner measurements are then assigned to wrong planes and the IEKF converges to a wrong pose. The interaction between the initial pose parameters and the performance of the IEKF is analyzed at the end of this section. The failures also explain the large discrepancy between the mean and median values.

To evaluate the performance of the IEKF over the epochs, we calculated a mean RMSE for each epoch over all S simulation runs. The calculation is as follows for the RMSE of \(t_x\) in epoch \(k\in \{1,\ldots ,K\}\):

$${\text{RMSE}}_{{t_x,{\text{Ep}}.,k}} = \frac{1}{S}\sum _{s=1}^{S} \sqrt{\left( t_{x,s,k}-{\bar{t}}_{x,k}\right) ^2}.$$
(54)

We obtain the RMSE for the other pose parameters analogously. The results can be found in Figs. 22, 23, 24, 25, 26 and 27.
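Both RMSE variants (Eqs. 53 and 54) operate on the same \([S\times K]\) array of filtered values; since \(\sqrt{(\cdot )^2}=|\cdot |\), they reduce to mean absolute deviations along one axis or the other. A minimal sketch with toy numbers:

```python
import numpy as np

def rmse_per_run(est, true):
    """Eq. (53): one value per run s, averaged over the K epochs.
    est has shape [S, K]; note that sqrt((.)^2) is simply abs(.)."""
    return np.abs(est - true).mean(axis=1)

def rmse_per_epoch(est, true):
    """Eq. (54): one value per epoch k, averaged over the S runs."""
    return np.abs(est - true).mean(axis=0)

# Toy example: S = 2 runs, K = 3 epochs, true t_x = 0 everywhere
est = np.array([[0.1, -0.1, 0.1],
                [0.2,  0.2, 0.2]])
true = np.zeros(3)
r_run = rmse_per_run(est, true)     # one value per run
r_ep = rmse_per_epoch(est, true)    # one value per epoch
```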

Fig. 22

\({\text{RMSE}}_{t_x,{\text{Ep}}.}\)

Fig. 23

\({\text{RMSE}}_{t_y,{\text{Ep}}.}\)

Fig. 24

\({\text{RMSE}}_{t_z,{\text{Ep}}.}\)

Fig. 25

\({\text{RMSE}}_{\omega ,{\text{Ep}}.}\)

Fig. 26

\({\text{RMSE}}_{\phi ,{\text{Ep}}.}\)

Fig. 27

\({\text{RMSE}}_{\kappa ,{\text{Ep}}.}\)

Figures 22, 23, 24, 25, 26 and 27 generally support the interpretations made before. The RMSE of the translation parameters \(t_{x}\) and \(t_{z}\) determined by the IEKF decreases with an increasing number of epochs. At the last epoch, the \({\text{RMSE}}_{t_x,{\text{Ep}}.}\) and \({\text{RMSE}}_{t_z,{\text{Ep}}.}\) of the IEKF are significantly smaller than the RMSE obtained by the LKF in both scenarios. Again, \({\text{RMSE}}_{t_y,{\text{Ep}}.}\) differs and increases after epoch 10 with an increasing number of epochs. In both scenarios, especially in scenario 2, the \({\text{RMSE}}_{t_y,{\text{Ep}}.}\) obtained by the IEKF is larger than the one obtained by the LKF in the last epochs. The RMSE of the orientation parameters show different characteristics, but are intrinsically quite similar. The RMSE obtained by the IEKF starts with large values and decreases significantly after epoch 10. In the last epoch, \({\text{RMSE}}_{\phi ,{\text{Ep}}.}\) and \({\text{RMSE}}_{\kappa ,{\text{Ep}}.}\) of the IEKF and the LKF are quite similar. \({\text{RMSE}}_{\omega ,{\text{Ep}}.}\) of the IEKF undercuts that of the LKF until epoch 9 (scenario 1) and epoch 30 (scenario 2), respectively. The \({\text{RMSE}}_{\kappa ,{\text{Ep}}.}\) of the LKF in scenario 2 is clearly affected by the simulated IMU drift. It is remarkable that (except for \({\text{RMSE}}_{t_y,{\text{Ep}}.}\)) the RMSE of the IEKF is very similar in the last epoch of scenario 1 and scenario 2.

For a deeper understanding of the link between the initialization and the performance of the IEKF, we plotted the initial GNSS and IMU values of the first epoch in parallel coordinates. The parallel coordinate representation was first used by Inselberg (1985). In this representation, the six pose components are shown on six parallel axes; on each axis, all initial values belonging to that component are depicted. The representation is used to identify whether some extreme initial pose values could have had a large impact on the divergence of the IEKF. In addition, we colored the runs where the IEKF failed (failure) in orange and the others (success) in blue. The distinction between failure and success is based on the RMSE of the position in the last epoch: a run is classified as a failure when a value is larger than 10 cm.
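The failure/success labeling can be expressed compactly; the following sketch assumes that the final-epoch position errors of each run are available componentwise (array shapes and names are illustrative, not taken from the implementation):

```python
import numpy as np

def classify_runs(pos_err_last, threshold=0.10):
    """Label each simulation run as failure (True) or success (False).

    pos_err_last : array-like of shape (S, 3) -- absolute position
                   errors (t_x, t_y, t_z) of each run in the last
                   epoch, in meters.
    A run fails when any component exceeds the 10 cm threshold.
    """
    return (np.abs(np.asarray(pos_err_last)) > threshold).any(axis=1)

errs = [[0.02, 0.03, 0.01],   # success
        [0.05, 0.12, 0.04],   # failure: t_y above 10 cm
        [0.30, 0.02, 0.02]]   # failure: t_x above 10 cm
failure = classify_runs(errs)
failure_rate = failure.mean()  # fraction of failed runs
```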

In scenario 1 (Fig. 28), the failure rate is 7.6%. For the initial pose parameters \(t_z\), \(\omega\), \(\phi\), and \(\kappa\), the failures are distributed over the whole value range. For \(t_x\), the failures are slightly concentrated in a range between − 9.2 and − 8.55 m and in a second range around − 10.6 m. For \(t_y\), the failures are even more concentrated: most of them stem from initial values in a range between 3.72 and 4.3 m. As previously mentioned, we have a weak measurement constellation for \(t_y\). Furthermore, the trajectory is parallel to the y-axis, reaching a true value of \(t_y = 7.5\,{\text{m}}\) in epoch 50. The consequence is that, with initial values for \(t_y\) between the true value in the first epoch (5 m) and 6.19 m, the IEKF may get back to the true values in a later epoch. This possibility is not given for initial values for \(t_y\) between 3.72 and 4.3 m.

Fig. 28 Initial pose values plotted as parallel coordinates in scenario 1. Total failure rate: 7.6%

In scenario 2, the failure rate of 20.4% is significantly higher. Again, for the initial pose parameters \(t_z\), \(\omega\), \(\phi\), and \(\kappa\), the failures are distributed over the whole value range. For \(t_y\), the failures lie mainly in a range between 3.72 and 5 m; the reason for this has already been explained for scenario 1. For \(t_x\), we can see a strong concentration of failures in a range between − 11.37 and − 10.4 m. Thus, it is mainly this range of \(t_x\) that causes the failures in scenario 2. The reason can be found in the systematically perturbed laser scanner measurements hitting the windows, which are extended by an additive value \(\Delta = 0.6\,{\text{m}}\). In combination with a disadvantageous initialization, these perturbed measurements are wrongly assigned to the wall plane and, therefore, cause the failures.
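The failure mechanism (window returns extended by \(\Delta = 0.6\,{\text{m}}\) being accepted as wall measurements) can be illustrated with a simple point-to-plane assignment test; the plane, the threshold value, and all names below are made up for the sketch and are not taken from the implementation:

```python
import numpy as np

def assign_to_plane(points, n, d, tau):
    """Assign points to the plane n.x = d if their orthogonal
    distance is below the threshold tau (n must be a unit normal)."""
    dist = np.abs(points @ n - d)
    return dist < tau

# wall plane x = 0 (unit normal along x), assignment threshold 0.8 m
n, d, tau = np.array([1.0, 0.0, 0.0]), 0.0, 0.8

wall_hit   = np.array([[0.05, 2.0, 1.5]])            # genuine wall return
window_hit = wall_hit + np.array([[0.6, 0.0, 0.0]])  # perturbed by 0.6 m

ok_wall   = assign_to_plane(wall_hit, n, d, tau)     # correctly accepted
ok_window = assign_to_plane(window_hit, n, d, tau)   # wrongly accepted,
# because the threshold tau exceeds the perturbation of 0.6 m
```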

Fig. 29 Initial values plotted as parallel coordinates in scenario 2. Total failure rate: 20.4%

We used GNSS and IMU observations in the measurement update step for all results displayed in this section. We repeated the same experiment without using the GNSS and IMU observations in the measurement update step; the differences in the results are negligibly small, so we do not show them here.

5 Conclusions and Outlook

In this paper, we presented the mathematical basics and the workflow of an IEKF which makes it possible to determine the trajectory of a UAS with an accuracy of better than 5 cm in position and \(0.08^{\circ}\) in orientation (median values). It is not mandatory to have continuous GNSS and IMU observations for the implemented IEKF. The trajectory is mainly obtained using laser scanner measurements of building facades, which are modeled as planes in a 3D city model. The laser scanner measurements and the planes of the 3D city model are combined by implicit measurement equations and nonlinear equality constraints within the IEKF.
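In compact form, such an implicit measurement equation relates a laser scanner point \(\mathbf{p}\) (in the sensor frame) to a facade plane with unit normal \(\mathbf{n}\) and distance \(d\) via the pose \(\left( \mathbf{R}, \mathbf{t}\right)\); written here as a sketch in our own notation, not necessarily identical to the symbols used earlier in the paper:

$$h\left( \mathbf{p}, \mathbf{n}, d, \mathbf{R}, \mathbf{t}\right) = \mathbf{n}^{\text{T}}\left( \mathbf{R}\,\mathbf{p} + \mathbf{t}\right) - d = 0,$$

together with the nonlinear equality constraint \(\left\Vert \mathbf{n}\right\Vert = 1\) on the plane normal.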

To demonstrate the functionality and performance of the IEKF implemented, we developed a simulator, which showed that the IEKF is even suited to handle systematically perturbed observations.

Nevertheless, the algorithm demonstrated should be tuned to deal better with disadvantageous initial values. A possible starting point is the threshold of the assignment algorithm: instead of a constant value, it should be varied according to the estimated accuracy of the predicted position.
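One way to realize such an adaptive threshold is to scale it with the standard deviation of the predicted position taken from the filter's covariance; the scale factor and the floor value in this sketch are assumptions, not values from the paper:

```python
import numpy as np

def adaptive_threshold(P_pred, tau_min=0.1, k=3.0):
    """Assignment threshold scaled with the predicted position accuracy.

    P_pred : 3x3 covariance block of the predicted position [m^2].
    Returns max(tau_min, k * largest position standard deviation),
    so that an uncertain prediction widens the assignment gate.
    """
    sigma_max = np.sqrt(np.linalg.eigvalsh(P_pred).max())
    return max(tau_min, k * sigma_max)

# example: 2 cm / 5 cm / 2 cm predicted standard deviations
P = np.diag([0.02**2, 0.05**2, 0.02**2])
tau = adaptive_threshold(P)
```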

In the future, we plan to evaluate the performance of the IEKF based on real data, especially regarding effects of generalization of the 3D city model and effects of a 3D city model with larger uncertainties. Thus, we can also check whether the assumptions made for the simulation (see Table 1) are also valid in reality. In particular, the performance of the GNSS receiver can deviate strongly from the simulated data with the same precision.

In the IEKF presented, the plane parameters and vertices are introduced as parameters to preserve the topology of the 3D city model. With an increasing number of planes and vertices, the computation becomes inefficient. Here, we plan to reorganize the IEKF presented by means of a dual-state Kalman filter.

As has already been mentioned, we plan to integrate camera measurements into the IEKF. This enables the stabilization of the trajectory in the long term and the improvement of the IEKF’s initial behavior. Therefore, we have to integrate the collinearity equations into our approach. Subsequently, we have to integrate the object coordinates of the tie points into the state parameter vector. This emphasizes the need to use a dual-state Kalman filter. By integrating the camera measurementsFootnote 8 into the IEKF, we must deal with observations captured in different epochs: one or more images observing a tie point were taken in a past epoch and only one image is taken in the current epoch.
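For reference, the collinearity equations that would have to be added as measurement equations relate the object coordinates \(\left( X, Y, Z\right)\) of a tie point to its image coordinates \(\left( x', y'\right)\). In standard photogrammetric notation, with principal distance \(c\), principal point \(\left( x_0', y_0'\right)\), projection center \(\left( X_0, Y_0, Z_0\right)\) and rotation matrix elements \(r_{ij}\) (symbols assumed here, not taken from the paper):

$$x' = x_0' - c\,\frac{r_{11}\left( X-X_0\right) + r_{21}\left( Y-Y_0\right) + r_{31}\left( Z-Z_0\right) }{r_{13}\left( X-X_0\right) + r_{23}\left( Y-Y_0\right) + r_{33}\left( Z-Z_0\right) },$$

$$y' = y_0' - c\,\frac{r_{12}\left( X-X_0\right) + r_{22}\left( Y-Y_0\right) + r_{32}\left( Z-Z_0\right) }{r_{13}\left( X-X_0\right) + r_{23}\left( Y-Y_0\right) + r_{33}\left( Z-Z_0\right) }.$$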

Finally, we aim to analyze the benefit of integrating further geometries in addition to the planes of the 3D city model. There are many cylindrically shaped objects in urban areas, such as lampposts, traffic lights, or street signs, which may be used in the IEKF.
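Analogously to the plane case, a cylinder with axis point \(\mathbf{a}\), unit axis direction \(\mathbf{r}\) and radius \(r_c\) could be introduced through an implicit condition on a georeferenced laser point \(\mathbf{q} = \mathbf{R}\,\mathbf{p} + \mathbf{t}\), namely that its distance to the axis equals the radius (a sketch in assumed notation, not from the paper):

$$\left\Vert \left( \mathbf{q} - \mathbf{a}\right) \times \mathbf{r}\right\Vert - r_c = 0,$$

again accompanied by the equality constraint \(\left\Vert \mathbf{r}\right\Vert = 1\).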