1 Introduction

Hyperspectral imaging (HSI) is a technique used to record electromagnetic signals in hundreds of narrow spectral bands. For this, linear pushbroom (LP) cameras are often used to store the radiometry in lines. Commonly, analysis of aerial HSI is based on a georeferenced and orthorectified image where raster pixels are of equal size and refer to a well-defined geographic coordinate system, i.e., an orthoimage. HSI orthoimages have a wide range of applications, including geological mapping (e.g., Kuras et al. 2022a; Ren et al. 2022), forestry (e.g., Trier et al. 2018; Allen et al. 2022), urban classification (e.g., Jonassen et al. 2019; Kuras et al. 2021, 2022b, 2023), and agriculture (e.g., Lu et al. 2020). Often, in-situ spectrometer measurements are used to link ground truth data to the orthoimages. This data fusion requires accurately georeferenced orthoimages to correctly identify the objects with measured spectra. The georeferenced orthoimages are produced based on the interior and exterior orientations of the camera at the time of each image exposure. Accurate estimates of the camera orientations can be provided from photogrammetric bundle adjustment. However, the exactness of the bundle adjustment is dependent on the fidelity of the functional model for the LP camera. In addition, a realistic stochastic model is important for reliable results and optimal estimation.

Even though the bundle-adjustment method is extensively tested and documented for frame cameras (e.g., Triggs et al. 2000; Förstner and Wrobel 2016), some issues arise when transferring the underlying methods to LP HSI. The problem of accurate bundle adjustment from LP HSI is threefold:

  1.

    Retrieval of observations for the bundle adjustment: The LP image lines cannot readily be used for retrieval of observations for the bundle adjustment using standard key-point detectors and descriptors owing to the lack of spatial neighborhood in one of the image dimensions. Combinations of consecutive LP image lines, i.e., LP image scenes, are too heavily distorted to describe key points so that they can be identified in multiple LP image scenes when acquired from typical flight altitudes \(> 1850\) m above ground level (AGL) (Fig. 1). The use of active stabilization does not fully solve the issue of this distortion. Additionally, once key-point correspondences have been found in multiple images, they should be filtered for outlier removal, e.g., using RANSAC with epipolar geometry constraints. However, the standard model for a pinhole camera cannot be used for the LP image scenes in RANSAC and an alternative model thus needs to be used.

  2.

    Over-parameterization in the bundle adjustment: The number of parameters is very high compared to the number of observations in the traditional bundle adjustment when using LP cameras. This is caused by the high exposure frequency and the need to estimate six parameters per LP image line, i.e., three for the position and three for the rotation angles. Thus, a method to reduce the number of parameters in the bundle adjustment is required.

  3.

    Modeling the LP HSI camera: Normally, the chromatic aberration is neglected in the traditional bundle adjustment. Thus, a modification of the traditional bundle adjustment is needed so that the chromatic aberration in HSI can be addressed.

Fig. 1

Relative orientation changes will lead to distorted LP image lines during the continuous image acquisition. The general direction of motion of the airplane is downwards in the figures. a Relative roll change, b Relative pitch change, c Relative heading change

We present a method to overcome these challenges where observations for the bundle adjustment are found and linked to single pixels in the LP image lines. Thus, the observations are independent of their pixel neighborhoods and other data sources such as a digital elevation model (DEM) or frame images. Furthermore, the proposed modified bundle adjustment offers a solution to the problem of sparsely distributed and few observations relative to the extensive number of image exposures from an LP camera. A general calibration method for the estimation of interior and exterior parameters in an LP HSI camera as well as trajectory corrections is examined. The main motivation for the work is to develop a technique to retrieve a sufficient number of observations for the bundle adjustment from an LP camera without being dependent on other data sources. The goal is to achieve accurate results applicable to high-quality orthoimage generation. No additional camera models beyond the standard pinhole camera model are introduced for bundle adjustment from LP HSI. The rigorous implementation of the pinhole camera model makes it possible to estimate corrections to the principal distance for the camera separately for different spectral bands to examine the chromatic aberration in the LP HSI camera. An experiment is shown to demonstrate the method and the resulting accuracy from the bundle adjustment. The trajectory model used in the bundle adjustment has previously been applied to hybrid adjustment from images and light detection and ranging (LiDAR) point clouds (e.g., Glira et al. 2019; Haala et al. 2022; Jonassen et al. 2023), making it a general model for future joint adjustment from 1D LP image lines, 2D frame images, and 3D LiDAR point clouds.

2 Related Work

Several techniques have been presented for geospatial correction of satellite LP image scenes (e.g., Poli and Toutin 2012; Sugimoto et al. 2018). However, limited research has been done on the topic for airborne or terrestrial vehicles, where trajectory errors are less smooth, a‑priori calibrations are worse (Lenhard 2015), and the GSD is both smaller and more irregular. Even though the interior calibration of LP cameras has been studied (e.g., Lenhard et al. 2015), the georeferencing accuracy is also affected by the mounting parameters of the camera, i.e., lever-arm and boresight angles, and by errors in time-dependent exterior trajectory parameters (see Jonassen et al. 2023, for examples of parameters).

Recently, Kim et al. (2021) pointed out the need for more investigations on accurate bundle adjustment of airborne LP cameras.

2.1 Observation Retrieval for the Bundle Adjustment

Few methods have been presented for the automatic retrieval of observations for the bundle adjustment from airborne LP image lines. Hasheminasab et al. (2021) used a DEM to partially orthorectify the LP image scenes to provide observations for the bundle adjustment. Using this method, GCPs are not needed to correct for the temporal errors in LP time stamps, lever-arm between the camera and inertial measurement unit (IMU), or ground coordinates of the tie points. The clear disadvantage of using a DEM for observation retrieval is the added uncertainty connected to the DEM quality and the increased processing needed to generate the DEM.

2.2 Triangulation from Three-line Cameras

In the three-line camera systems, three image lines are acquired simultaneously (e.g., Sandau and Bärwald 1994; Heipke et al. 1996; Sandau et al. 2000; Chen et al. 2003, 2004). Images of these systems have successfully been used for triangulation by assuming identical orientations for the three lines in certain orientation images (e.g., Heipke et al. 1996; Tempelmann et al. 2000; Hinsken et al. 2002; Chen et al. 2004). The orientation images are separated with regular time intervals, and the camera orientations at distinct image exposures are retrieved by interpolation between the neighboring orientation images. The three-line camera setup significantly improves the number of observations in the bundle adjustment by providing three overlapping images within a single flight strip. However, this setup is not common for LP HSI cameras, possibly owing to the interior camera optics.

2.3 Co-registration of LP Image Scenes with Frame Images

Bundle adjustment from frame images has been a widely researched topic for decades. The accuracy of the pinhole model is well documented and the bundle adjustment using this model has been extensively tested (e.g., Jacobsen et al. 2010). Barbieux (2018) carried out a study to co-register LP image scenes projected onto a DEM with orthoimages from adjusted frame images as reference. Angel et al. (2019) studied the co-registration with an affine transformation between LP image scenes and orthoimages from a frame camera. Similarly, Jurado et al. (2021) did a homography calculation between the LP image scenes and the reference orthoimage.

The high-dimensional estimation problem is severely reduced using these methods, but the resulting accuracy of the LP orthoimage is dependent on the accuracy of the reference orthoimage and the fidelity of the model used to relate the two orthoimages. Thus, the techniques are only indirect solutions to the problem of bundle adjustment from LP image lines and require additional reference data. A method where observations are directly linked to the LP image lines would remove the need for additional frame images and provide exact observations for the bundle adjustment.

2.4 Adjustment from LP Image Lines

Habib et al. (2018) presented three bundle adjustment methods under the assumption that other parameters than boresight angles had a sub-pixel impact on the georeferencing of LP image lines. This assumption effectively limits the number of correlated parameters, but the under-parameterization was pointed out as a weakness of the methods. Kim et al. (2021) used one of these methods to include the estimation of principal distance and lever-arm offsets from an LP camera mounted on an unmanned aerial vehicle (UAV), resulting in an accuracy of \(2.3\) times the GSD.

These methods for adjustment from LP imaging focus only on estimating some of the parameters related to the largest errors and neglect the estimation of time-dependent trajectory parameters, i.e., errors in position and orientation stemming from the global navigation satellite system (GNSS)-aided inertial navigation system (INS) processing. These errors can in many cases be significant (e.g., Skaloud et al. 2010). The orientation errors of GNSS/INS in airborne systems increase with longer flight strips, mostly due to the limited observability of heading errors in the GNSS/INS for typical flight dynamics. This is further worsened by using stabilized platforms.

2.5 Modeling the Trajectory Corrections

Recently, corrections to the trajectory have been estimated in time intervals of equal length when doing the bundle adjustment (e.g., Glira et al. 2016; Jonassen et al. 2023). This reduces the estimation complexity, as the trajectory corrections are estimated in longer time intervals than in the standard bundle adjustment. Even though the trajectory positions and orientations typically vary significantly, their corrections are expected to be small and smooth. An assumption for the model is that high-frequency motion is captured by the INS and that only low-frequency trajectory errors remain after initial GNSS/INS processing. The most accurate results have been presented by correcting the low-frequency trajectory errors using a cubic spline trajectory correction model (Glira et al. 2016). Given the high exposure frequency of LP cameras, and because the model avoids estimating camera orientations per LP image line, it is especially suitable for reducing the number of unknowns in the bundle adjustment from LP image lines. The model has been shown to provide state-of-the-art accuracy for adjustment from frame cameras and LiDAR scanners (Glira et al. 2019; Haala et al. 2022; Jonassen et al. 2023).

3 Methodology

All pixels within an LP image line are recorded simultaneously and consecutive 1D LP image lines can be combined to form 2D LP image scenes with larger image extents. In these LP image scenes, each LP image line is an individual image exposure (Fig. 2), while an LP image scene is associated with multiple spatiotemporally adjacent LP image lines (see Morgan 2004).

Fig. 2

An LP image scene (red) is a mosaic of several consecutive LP image lines. LP image lines are separate exposures with respective time stamps (blue). The coordinate system is defined with the x‑axis towards the right and the y‑axis downwards in the figure (parallel to the flight direction). The image x‑axis is parallel with the view plane

The proposed method roughly consists of pre-processing, initialization, and iterative parameter estimation and correction. The three main steps of the pre-processing part are explained in detail in Sect. 3.1 and lead to the observations to be used in the bundle adjustment. The steps are summarized as:

  1.

    GNSS/INS processing which results in a platform trajectory. This trajectory processing is done using a Kalman filter and fixed-interval smoother with the software TerraPos (see Kjørsvik et al. 2009). Consequently, a trajectory with associated error covariance matrices for both the position and orientation of the platform is computed. The approximate a‑priori camera lever-arm and boresight angles are known from initial measurements and relate the platform trajectory to the camera.

  2.

    Computation of rotation compensation of LP image lines. The trajectory and camera boresight angles are used to compute rotation compensations for the LP image lines in the image x‑dimension.

  3.

    Key-point detection, matching, and RANSAC filtering from rotation-compensated LP image scenes. The key points are detected and described in the LP image scene using the binary robust invariant scalable keypoints (BRISK) method (Leutenegger et al. 2011). Furthermore, these key points are matched to find key-point correspondences that represent the same features in different LP image scenes. The rotation compensation is no longer considered once key-point correspondences have been filtered using RANSAC and the observations for the following bundle adjustment have been formed.

3.1 LP Observation Retrieval

The LP image line distortions originate from angular motion (Fig. 1). This motion is of too high a frequency to be compensated for with a stabilized platform but is precisely captured by the IMU. The motion is compensated in the image x‑dimension to detect the same salient key points across several LP image scenes, using standard feature-based matching. An x‑coordinate shift is computed per pixel to form the rotation-compensated LP image scenes:

$$\boldsymbol{\delta}(t_{e})_{[i]}=\left(\boldsymbol{R}_{c}^{m}(t_{s})\right)^{-1}\boldsymbol{R}_{c}^{m}(t_{e})\begin{bmatrix}x_{[i]}^{c}\\ 0\\ c^{c}\end{bmatrix}-\begin{bmatrix}x_{[i]}^{c}\\ 0\\ c^{c}\end{bmatrix}$$
(1)
  • \(\boldsymbol{\delta}(t_{e})_{[i]}\) is the relative rotation compensation of pixel \([i]\) at the exposure time of the LP image line \((t_{e})\) (\(3\times 1\) vector). This vector has the three components \(\delta_{x}\), \(\delta_{y}\), and \(\delta_{z}\).

  • \(\boldsymbol{R}_{c}^{m}(t_{s})\) is the rotation matrix from the camera frame \(c\) to the map frame \(m\) at the exposure time of the first LP image line in the LP image scene \((t_{s})\) (\(3\times 3\) matrix). The map frame is an arbitrary world-fixed reference frame.

  • \(\boldsymbol{R}_{c}^{m}(t_{e})\) is the rotation matrix from the camera frame to the map frame at the exposure time of the LP image line (\(3\times 3\) matrix).

  • \(x_{[i]}^{c}\) is the image x‑coordinate in the camera frame (scalar).

  • \(c^{c}\) is the principal distance of the camera in the camera frame (scalar).

\(\boldsymbol{R}_{c}^{m}\) is the product of the processed GNSS/INS orientations and camera boresight rotations. From \(\boldsymbol{\delta}\), only the \(\delta_{x}\) component is used. Successive LP image lines are spatiotemporally adjacent so that \(\delta_{y}\approx 0\) for all pixels. Once the corrections for the rotation compensations have been retrieved, linear interpolation in the image x‑dimension is used to form LP image scenes with local pixel neighborhoods suitable for key-point detection and description (Fig. 3).
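As an illustration of Eq. 1, the following minimal sketch computes the rotation compensation for a single pixel. The function names `rotation_compensation` and `rot_y` are hypothetical, the principal distance is assumed to be given in pixel units, and the example assumes that a roll-like change can be represented as a rotation about the camera y‑axis; none of these choices are prescribed by the method itself.

```python
import numpy as np

def rotation_compensation(R_start, R_line, x_cam, c_cam):
    """Eq. 1: compensation vector (delta_x, delta_y, delta_z) for one pixel.

    R_start: 3x3 camera-to-map rotation at the first line of the scene (t_s).
    R_line:  3x3 camera-to-map rotation at the exposure of this line (t_e).
    x_cam:   image x-coordinate of the pixel in the camera frame.
    c_cam:   principal distance, in the same units as x_cam.
    """
    ray = np.array([x_cam, 0.0, c_cam])
    # The inverse of a rotation matrix equals its transpose.
    return R_start.T @ R_line @ ray - ray

def rot_y(angle_rad):
    """Rotation about the y-axis (assumed here to mimic a relative roll change)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

# Hypothetical example: a small relative rotation between t_s and t_e.
delta = rotation_compensation(np.eye(3), rot_y(2e-4), x_cam=0.0, c_cam=6000.0)
delta_x = delta[0]  # only this component is used to shift the pixel
```

In this sketch a pure rotation about the y‑axis leaves the second component at zero, which matches the statement that only \(\delta_{x}\) is used.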

Fig. 3

Subset of an LP image scene showing parts of a running track with four lanes. The original LP image scenes (a) are compensated for rotation in the x‑dimension using linear interpolation (b). The general direction of platform motion is downwards in the figures

Key-point detectors require a 2D image neighborhood. Thus, the key points are detected in the rotation-compensated LP image scenes. The filtering, following the key-point descriptor matching, is done using RANSAC with epipolar geometry constraints. This epipolarity is described with hyperbolas in LP image scenes rather than epipolar lines as in frame images (Konecny et al. 1987; Orun and Natarajan 1994; Gupta and Hartley 1997). The platform velocity and orientation are close to constant in the rotation-compensated LP image scenes owing to the short time window of the image exposures in the LP image scene. Thus, the LP image scenes not only correct the neighborhood for key-point detection and matching, but they also minimize the effect of changes in platform orientation and velocity so that the key-point correspondences can be filtered with the use of fundamental matrix estimation from Gupta and Hartley (1997). Using this model, at least 11 key-point correspondences are needed for RANSAC filtering. Although other approaches exist, the simple RANSAC filtering is well-tested and is only done to remove gross outliers from the key-point correspondences. Further fine filtering with the removal of observation outliers is done in the following bundle adjustment.
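The minimal sample size affects how many RANSAC iterations are needed. The sketch below uses the commonly cited iteration-count formula for RANSAC; the inlier ratio and the confidence level are illustrative assumptions, not values from the experiment, and the function name `ransac_iterations` is hypothetical.

```python
import math

def ransac_iterations(sample_size, inlier_ratio, confidence=0.99):
    """Number of random samples needed so that, with the given confidence,
    at least one sample contains only inliers (standard RANSAC formula)."""
    p_good_sample = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good_sample))

# Assumed inlier ratio of 0.8; 11 correspondences per sample as stated above.
n_iter_lp = ransac_iterations(sample_size=11, inlier_ratio=0.8)
# For comparison: an 8-point sample, as often used for frame-camera geometry.
n_iter_frame = ransac_iterations(sample_size=8, inlier_ratio=0.8)
```

A larger minimal sample requires more iterations for the same inlier ratio, which is one reason to keep this step limited to gross outlier removal.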

The observations for the bundle adjustment are formed from the direct link between key-point correspondences from the image scenes and pixels in the LP image line. The pixel neighborhood used to identify the key-point correspondences is only approximate because the distortion in the y‑dimension is not corrected (Fig. 1). To account for this, the observations are registered at the center of their respective pixel.

The radiometric inter-band differences make it challenging to retrieve reliable observations between spectral bands when the spectral bands are not merged before key-point detection. Thus, observations from HSI are only matched between the same spectral bands. A consequence of this is that the chromatic aberration can be examined through empirical testing, which will be shown in the following experiment.

3.2 Stochastic Modeling

Registering the observations to the center of their respective pixel implies that the observation errors are expected to be uniformly distributed within the extent of a pixel. Thus, the well-defined standard deviation (STD) of this standard uniform distribution is used as the basis for setting the observation precision in the least squares estimation. However, unmodeled effects also influence the observation uncertainty. Thus, the STD of the described uniform distribution only provides an optimistic baseline estimate for defining the observation weights for the following bundle adjustment.
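For reference, the baseline STD can be written out explicitly. Assuming the observation error \(e\) is measured from the pixel center in pixel units and is uniformly distributed over \([-1/2, 1/2]\), its STD is:

$$\sigma_{e}=\sqrt{\int_{-1/2}^{1/2}e^{2}\,\mathrm{d}e}=\frac{1}{\sqrt{12}}\approx 0.29\ \text{pixels}$$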

3.3 Spline Interpolation of Trajectory Corrections

Exact timestamps are needed for the image lines with observations to successfully link them to the trajectory. In the presented method, only the trajectory corrections, i.e., corrections to platform position and orientation, are estimated temporally. It is assumed that the angular motion of the platform is captured by the IMU and compensated for with the GNSS/INS orientations. However, smooth trajectory errors remain in the GNSS/INS solution when doing data acquisition from airplanes, mostly owing to the limited observability of heading errors in the GNSS/INS processing. Owing to the limited number of observations provided per LP image line, the trajectory corrections are only estimated for time steps with a fixed interval.

A very long trajectory segmentation time step, similar to the duration of a flight strip, may be used to limit the number of parameters, but it also increases the risk of underfitting. Conversely, a very short trajectory segmentation time step, close to the interval between image exposures, may be used to imitate the traditional bundle adjustment. The latter leads to an excessive number of trajectory parameters when using an LP camera, increases the risk of overfitting, and may prevent the estimator from converging. This is a result of the six unknowns required per time step and the practical impossibility of retrieving enough observations to estimate these parameters per LP image line.

Trajectory corrections at the discrete exposure times of the LP image lines are needed when calculating the reprojection errors in the bundle adjustment, and thus a cubic spline interpolation of the corrections is done (Fig. 4). The corrections at the time steps with the fixed interval serve as nodes in the cubic spline. In practice, the technique makes it possible to limit the number of estimation parameters from the LP camera system by three orders of magnitude compared to estimating exterior orientations for each LP image line.
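The interpolation step can be sketched as follows. This is a minimal example that assumes a natural cubic spline (zero second derivative at the end nodes); the boundary conditions, the function names, and the example values are assumptions and are not taken from the implementation used in the experiment.

```python
import numpy as np

def natural_cubic_spline(t_nodes, y_nodes, t_query):
    """Interpolate node corrections y_nodes, shape (n,) or (n, k), at times
    t_query with a natural cubic spline through the nodes."""
    t = np.asarray(t_nodes, dtype=float)
    y = np.asarray(y_nodes, dtype=float)
    tq = np.atleast_1d(np.asarray(t_query, dtype=float))
    n = t.size
    h = np.diff(t)
    # Linear system for the second derivatives M at the nodes.
    A = np.zeros((n, n))
    rhs = np.zeros_like(y)
    A[0, 0] = A[-1, -1] = 1.0  # natural boundary: M = 0 at both ends
    for i in range(1, n - 1):
        A[i, i - 1] = h[i - 1]
        A[i, i] = 2.0 * (h[i - 1] + h[i])
        A[i, i + 1] = h[i]
        rhs[i] = 6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    M = np.linalg.solve(A, rhs)
    # Find the node interval of each query time.
    idx = np.clip(np.searchsorted(t, tq, side="right") - 1, 0, n - 2)
    hi = h[idx]
    a = (t[idx + 1] - tq) / hi
    b = (tq - t[idx]) / hi
    if y.ndim > 1:  # broadcast over the k correction components
        a, b, hi = a[:, None], b[:, None], hi[:, None]
    return (a * y[idx] + b * y[idx + 1]
            + ((a**3 - a) * M[idx] + (b**3 - b) * M[idx + 1]) * hi**2 / 6.0)

def n_trajectory_unknowns(duration_s, step_s, per_step=6):
    """Six unknowns (position and orientation) per spline node or per line."""
    return per_step * (int(np.ceil(duration_s / step_s)) + 1)

# Hypothetical 100 s flight strip: nodes every 10 s versus one set per line,
# assuming about 200 LP image lines per second.
nodes = n_trajectory_unknowns(100.0, 10.0)
per_line = n_trajectory_unknowns(100.0, 1.0 / 200.0)
```

With these assumed numbers, the ratio `per_line / nodes` is on the order of a thousand.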

Fig. 4

The GNSS/INS processing leads to an erroneous a‑priori trajectory. The corrections from the a‑priori to the true trajectory are smooth. Thus, the calculation of reprojection errors is dependent on the cubic spline interpolation of the updated trajectory corrections where corrections at time steps \(t_{i}\) are spline nodes

3.4 Bundle Adjustment from LP Image Lines

Expressing the observations per LP image line permits rigorous modeling with the pinhole camera model in the bundle adjustment. After the observation retrieval, the observations are exact per pixel in the LP image lines and these lines are treated as separate image exposures.

The background for bundle adjustment based on spline interpolation of trajectory corrections is documented in literature (e.g., Glira et al. 2019; Jonassen et al. 2023). This section recaps this work and puts it in context for bundle adjustment from LP image lines. More so for LP image lines than frame images, trajectory segmentation leads to a significant reduction in the number of parameters needed to provide an accurate solution to the least-squares estimation problem. The functional model used for the LP camera is the same as for pinhole cameras and gives the relationship between point coordinates in object and image spaces:

$$\boldsymbol{\hat{p}}^{c}_{[k]}=\frac{c^{c}}{p^{c}_{[k],z}}\begin{bmatrix}p^{c}_{[k],x}\\ p^{c}_{[k],y}\end{bmatrix}+\boldsymbol{\gamma}$$
(2)
  • \(\boldsymbol{\hat{p}}^{c}_{[k]}\) is the coordinates of the point \([k]\) in the camera frame (\(2\times 1\) vector). This may be expanded to account for additional image corrections, e.g., lens distortions or principal-point offset.

  • \(\boldsymbol{p}^{c}_{[k]}\) is the \(3\times 1\) vector between the camera projection center and the object point in the camera frame.

  • \(\boldsymbol{\gamma}\) represents additional image corrections; i.e., non-linear distortion and principal point (see Brown 1966). The principal point is expected to be known from a‑priori camera calibrations and its correction correlates highly with the tangential distortion. Thus, the principal point correction is not estimated. \(\boldsymbol{\gamma}\) is here defined from the radial distortion (coefficients \(K_{1}\) and \(K_{2}\)), tangential distortion (coefficients \(P_{1}\) and \(P_{2}\)), and principal point (\(x_{0}^{c}\)):

$$\boldsymbol{\gamma}=\begin{bmatrix}K_{1}(p^{c}_{[k],x})^{3}+K_{2}(p^{c}_{[k],x})^{5}+3P_{1}(p^{c}_{[k],x})^{2}\\ P_{2}(p^{c}_{[k],x})^{2}\end{bmatrix}+x_{0}^{c}$$
(3)

\(\boldsymbol{p}^{c}_{[k]}\) is expanded to be differentiable with respect to the parameters in an airborne mapping platform:

$$\boldsymbol{p}^{c}_{[k]}=\boldsymbol{\beta}_{p}^{c}\left(\boldsymbol{R}_{p}^{m}(t_{e})\right)^{-1}\left(\boldsymbol{p}^{m}_{[k]}-\boldsymbol{x}_{p}^{m}(t_{e})-\boldsymbol{R}_{p}^{m}(t_{e})\boldsymbol{l}^{p}\right)$$
(4)
  • \(\boldsymbol{\beta}_{p}^{c}\) is the camera boresight matrix (\(3\times 3\) matrix), i.e., the rotation matrix from the platform frame \(p\) to the camera frame.

  • \(\boldsymbol{R}_{p}^{m}(t_{e})\) is the rotation matrix from the platform frame to the map frame \(m\) at the exposure time (\(3\times 3\) matrix).

  • \(\boldsymbol{p}^{m}_{[k]}\) is the coordinates of the point in the map frame (\(3\times 1\) vector).

  • \(\boldsymbol{x}_{p}^{m}(t_{e})\) is the platform position in the map frame at the exposure time (\(3\times 1\) vector).

  • \(\boldsymbol{l}^{p}\) is the lever-arm from the platform origin to the camera optical center expressed in the platform frame (\(3\times 1\) vector).

\(\boldsymbol{p}^{m}_{[k]}\) is the tie-point coordinates defined in object space and its initial state is automatically estimated as the point of intersection between the image rays from two observations of the tie point. Corrections to the tie-point coordinates are estimated in the bundle adjustment, and observation outliers are identified and rejected in each iteration based on their reprojection errors.

The updates for the time-dependent \(\boldsymbol{R}_{p}^{m}(t_{e})\) and \(\boldsymbol{x}_{p}^{m}(t_{e})\) are computed by interpolating their corrections at \(t_{e}\) using cubic splines (Fig. 4). This is done to calculate the reprojection errors, i.e., the observation model residuals, in the iterative least squares estimation.
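A minimal sketch of the functional model in Eqs. 2–4 is given below. The function name `project_point` and the example values are hypothetical, the distortion terms follow Eq. 3 as written, and the principal-point term is left out because its correction is not estimated. The trajectory inputs are assumed to be already corrected with the spline-interpolated values at \(t_{e}\).

```python
import numpy as np

def project_point(p_map, x_platform, R_platform, lever_arm, boresight, c,
                  K1=0.0, K2=0.0, P1=0.0, P2=0.0):
    """Project an object point into image coordinates (Eqs. 2-4).

    p_map:      object point in the map frame (3,).
    x_platform: platform position in the map frame at t_e (3,).
    R_platform: rotation from platform frame to map frame at t_e (3x3).
    lever_arm:  lever-arm in the platform frame (3,).
    boresight:  rotation from platform frame to camera frame (3x3).
    c:          principal distance.
    """
    # Eq. 4: vector from the projection center to the point, camera frame.
    p_cam = boresight @ R_platform.T @ (p_map - x_platform - R_platform @ lever_arm)
    # Eq. 2 without the correction term: central projection.
    xy = c / p_cam[2] * p_cam[:2]
    # Eq. 3: radial and tangential terms (principal point omitted here).
    x = p_cam[0]
    gamma = np.array([K1 * x**3 + K2 * x**5 + 3.0 * P1 * x**2,
                      P2 * x**2])
    return xy + gamma

# Hypothetical example: level platform at the origin, no lever-arm or boresight.
xy = project_point(p_map=np.array([10.0, 0.0, 2000.0]),
                   x_platform=np.zeros(3), R_platform=np.eye(3),
                   lever_arm=np.zeros(3), boresight=np.eye(3), c=6000.0)
# The reprojection error is the observed image coordinate minus this projection.
```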

3.5 Chromatic Aberration in HSI Cameras

Owing to chromatic aberration, the same features are registered with different image coordinates in the simultaneously acquired spectral bands. The effect inherently leads to inconsistent magnification for the different spectral bands.

Given the accurate camera model, the chromatic aberration is examined through empirical testing by estimating different principal distance corrections for different spectral bands. A common set of camera and trajectory correction parameters are otherwise used for all the spectral bands. This examination of the chromatic aberration is possible owing to the rigorous bundle-adjustment implementation.

4 Experimental Results

A data set was acquired at the Norwegian University of Life Sciences in Ås, Norway (59.665\({}^{\circ}\) N, 10.775\({}^{\circ}\) E) in October 2022 to demonstrate the method. The area of 600 m \(\times\) 600 m was surveyed with an airplane equipped with a gyro-stabilized HySpex VNIR-1800 LP HSI camera and an Applanix POS-AV 510 INS. The data acquisitions were done at two altitudes of 1875 and 2500 m AGL (Fig. 5). All flight strips were covered twice with opposite headings. The acquisition at 2500 m AGL consisted of eight flight strips and the one at 1875 m AGL of 16 flight strips, which resulted in a total of 24 flight strips. The flying speed of 67 m\(/\)s yielded ground sampling distances of 0.3 and 0.4 m for the two flight altitudes. The LP HSI camera had 1800 pixels per LP image line and a total of 186 uniformly distributed spectral bands. The bands covered wavelengths between 404–994 nm with a spectral bandwidth of 4.9 nm measured at full width at half maximum. The image sensor had a pixel size of \(6.5\,\upmu\text{m}\), an initial principal distance of 4 cm, and the total field of view (FoV) of the camera was 16.6\({}^{\circ}\). This resulted in ground swaths of 547 and 730 m from the two flight altitudes. The processed GNSS/INS solution used as input to the bundle adjustment had a positional STD of \(<1.3\) cm in each of east and north, and \(<2.0\) cm in height.
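The reported swaths and across-track GSDs can be cross-checked from the flight altitude, the FoV, and the number of pixels per line. The sketch below assumes flat terrain and a nadir-looking camera; the function names are hypothetical.

```python
import math

def swath_width(altitude_m, fov_deg):
    """Ground swath for a nadir-looking camera over flat terrain."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)

def across_track_gsd(altitude_m, fov_deg, n_pixels):
    """Mean across-track ground sampling distance over the swath."""
    return swath_width(altitude_m, fov_deg) / n_pixels

for altitude in (1875.0, 2500.0):
    swath = swath_width(altitude, 16.6)
    gsd = across_track_gsd(altitude, 16.6, 1800)
    print(f"{altitude:.0f} m AGL: swath {swath:.0f} m, GSD {gsd:.2f} m")
```

Under these assumptions, the computed values agree with the reported swaths and GSDs to within rounding.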

Fig. 5

Flight strips where data was acquired at 1875 m AGL (blue and orange lines), GCPs (red triangles), CPs (yellow circles), and permanent GNSS reference station (green square). All flight strips were covered twice with opposite headings. Eight of the flight strips covering the center of the study area were flown at 2500 m AGL (orange lines), yielding a total of 24 flight strips

Seventeen white 1.2 m \(\times\) 1.2 m reference squares were placed on the ground within the survey area (Fig. 6). Four of these were used as GCPs in the adjustment and 13 as check points (CPs), i.e., tie points with known coordinates for accuracy assessment. The center points of the reference squares were measured using real-time kinematic (RTK) GNSS positioning on 2–3 repeated visits on the same day as the aerial data acquisition. This RTK-based reference data had an estimated precision of \(\sim 1\) cm from the mean position of the repeated visits and a baseline of \(<1\) km to the nearest permanent GNSS reference station.

Fig. 6

Example of a 1.2 m \(\times\) 1.2 m reference square as seen from the ground (a) and in a true color composite LP image scene from the HSI camera (b)

LP image scenes were created from sequences of 200 LP image lines. This corresponded to intervals of \(\sim 1\) s and led to a \(|\delta_{x}|\) with root mean square (RMS) of 0.91 pixels and a maximum of 4.43 pixels (see Eq. 1). The observation retrieval was done separately for seven different spectral bands with \(\sim 77\,\text{nm}\) spacing, i.e., about every 24th spectral band, and key-point descriptors were only matched between the same bands. Considering the narrow spectral bandwidth, the spectral bands were considered radiometrically independent and the chromatic aberration could thus be examined from the experiment. This pre-processing resulted in 441 587 2D observations. Additional experiments with 100 and 300 LP image lines per image scene resulted in 352 431 and 385 166 2D observations, respectively.

A trajectory segmentation time step of some seconds has shown good performance for airborne platforms in earlier work (e.g., Glira et al. 2016; Haala et al. 2022; Jonassen et al. 2023). Specifically, Glira et al. (2016) showed accurate results from UAV data with time steps \(\leq 10\) s. Thus, the trajectory segmentation time step was set to 10 s.

Corrections to boresight angles (three parameters), radial lens distortions (two parameters), tangential lens distortions (two parameters), trajectory positions (three parameters per 10 s time segment), and trajectory orientation (three parameters per 10 s time segment) were constrained to a single parameter set for all spectral bands. However, the principal distance corrections were estimated separately for the seven different spectral bands. This was done to examine how the chromatic aberration impacted the estimation of the principal distance correction.

In addition to the observation uncertainty introduced by the feature detection (Sect. 3.1), other factors also impact this uncertainty, e.g., approximations in the used model for lens distortions, exactness in the model used for the atmospheric refraction of the image rays, and platform movement during the acquisition of an LP image line, i.e., the motion blur. However, the exact precision estimate of each observation is not feasibly obtained as several of these effects vary both temporally and spatially. Thus, the precision of the observations was conservatively defined as \(1/2\) pixel.

Only tie points observed in three or more LP image lines were used for matching, which required three overlapping flight strips to provide observations for the bundle adjustment. This resulted in 113 758 used tie points in object space. A total of 342 407 unknowns were estimated, including parameters for each of the tie-point coordinate corrections.

The experiment was done with and without the use of the four GCPs. The remaining reprojection errors after convergence from the experiment with four GCPs are shown in Table 1 as minimum, maximum, median, mean, STD, and normalized median absolute deviation (NMAD).
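The NMAD is a robust alternative to the STD. The sketch below uses its common definition, i.e., the median absolute deviation scaled by 1.4826 so that it matches the STD for normally distributed errors; this definition is assumed here, and the example values are made up.

```python
import numpy as np

def nmad(errors):
    """Normalized median absolute deviation of a set of errors."""
    e = np.asarray(errors, dtype=float)
    return 1.4826 * np.median(np.abs(e - np.median(e)))

# Hypothetical reprojection errors in pixels, including one gross outlier.
errors = [0.1, -0.2, 0.05, 0.3, -0.1, 5.0]
robust_spread = nmad(errors)
classic_spread = np.std(errors)
```

In this made-up example the single outlier inflates the STD, whereas the NMAD stays close to the spread of the remaining errors.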

Table 1 Reprojection error statistics of the 113 758 tie points after bundle adjustment with the use of the four GCPs. Values are given in pixel units

The overall CP error statistics after bundle adjustment with the use of the four GCPs are listed in Table 2. The statistics from the experiment without the use of any GCPs are shown in Table 3. The results show NMAD \(\sim 1/4\) of the GSD in each of east and north, whereas the relatively large height error is mainly owing to the small angle of intersection with the limited FoV. Figure 7 shows the meter-level precision estimates of the height component of the CPs.

Table 2 Overall error statistics of the 13 CPs after bundle adjustment with the use of the four GCPs
Table 3 Overall error statistics of the 13 CPs from the experiment without the use of the four GCPs
Fig. 7 Precision estimates of the height component of the CPs after bundle adjustment with the use of the four GCPs

Figure 8 shows the corrected camera principal distance differences for the different spectral bands and their a‑posteriori precision estimates.

Fig. 8 Principal distances estimated for the seven selected bands of the HySpex VNIR-1800 camera. The values are reduced to their mean (40.3 mm). The differences are \(\leq 11.8\,\upmu\text{m}\), with a‑posteriori precision estimates of 1.2–2.4 \(\upmu\)m (error bars)

The results show that bundle adjustment was performed so that high-quality orthoimages can be produced. Figure 9 shows the ground reference squares plotted over a mosaic of true color composites of HSI orthoimages from the acquisition at 1875 m AGL. The orthoimages were created using a 20 cm digital surface model.

Fig. 9 Enlarged white ground reference squares show that they can be accurately found in the orthomosaic created from the experiment results. The background image is a mosaic of true color composites of HSI orthoimages from the acquisition at 1875 m AGL with a pixel size of 30 cm

5 Discussion

We concentrate the discussion of the results on three topics: LP observation retrieval, bundle adjustment results, and chromatic aberration.

5.1 LP Observation Retrieval

The \(\delta_{x}\) value used to rotation-compensate the LP image lines in the x‑dimension is a simplification as a large relative orientation change within an LP image scene may also affect the image y‑dimension. Such large distortions will affect the radiometric quality and possibly the flight strip coverage. While side overlap is common in the acquisition of LP image lines, there is no forward overlap, and large image distortions of several pixels in the image y‑dimension are usually considered unintended deviations from the planned image flight.

Both Fig. 3 and the \(|\delta_{x}|\) values from the experiment, with an RMS of 0.91 pixels and a maximum of 4.43 pixels, show the need to rotation-compensate the LP image scenes for automatic observation retrieval using key-point detection and description dependent on spatial pixel neighborhoods, even when using a stabilized platform. The rotation compensation \(\delta_{x}\) is effective; however, a similar compensation for the mainly pitch-induced \(\delta_{y}\) is not as straightforward. While the \(\delta_{x}\) correction is orthogonal to the scanning direction, and thus independent of it, the \(\delta_{y}\) correction changes the scan direction. Depending on the sign of \(\delta_{y}\), the image neighborhood after the \(\delta_{y}\) correction may have empty lines, or several original lines may be mapped to the very same corrected location, making it unclear which of the coinciding lines should be assigned to the extracted feature point. Still, the missing \(\delta_{y}\) correction may explain the slightly larger NMAD for the y residuals (see Table 1). The rotation corrections will naturally be much larger with poorly stabilized camera systems. Thus, it would be interesting to investigate whether the proposed method for rotation compensation is also applicable to camera systems with higher dynamics or to camera systems with IMUs that are not sufficiently accurate to capture the angular motion errors at a sub-pixel level. A small GSD also increases the need for more accurate compensation of the camera motion.
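The ambiguity of a \(\delta_{y}\) correction can be illustrated with a small Python sketch. It is a toy model, not the processing used in the experiment: each original line index is shifted by a hypothetical per-line \(\delta_{y}\) and rounded to the nearest integer line position. The function names, the sign convention, and the linear \(\delta_{y}\) ramp are assumptions made for illustration.

```python
import numpy as np


def corrected_line_positions(delta_y):
    """Shift each original line index by delta_y and round to integer lines."""
    original = np.arange(len(delta_y))
    return np.rint(original + delta_y).astype(int)


def empty_and_shared_positions(corrected):
    """Return corrected positions that receive no line, and those hit twice or more."""
    targets, counts = np.unique(corrected, return_counts=True)
    full_range = np.arange(corrected.min(), corrected.max() + 1)
    empty = np.setdiff1d(full_range, targets)
    shared = targets[counts > 1]
    return empty.tolist(), shared.tolist()


# Hypothetical delta_y ramp over eight lines (assumption, in pixels).
ramp = np.linspace(0.0, 2.0, 8)

# Positive delta_y stretches the scene: some corrected lines stay empty.
empty_pos, shared_pos = empty_and_shared_positions(corrected_line_positions(ramp))

# Negative delta_y compresses the scene: several lines share one position.
empty_neg, shared_neg = empty_and_shared_positions(corrected_line_positions(-ramp))
```

With the positive ramp the corrected scene contains empty lines, and with the negative ramp two original lines land on the same corrected line, which is the assignment ambiguity described above.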

The LP image scenes have to be short enough to provide sufficiently small reprojection errors for efficient use of RANSAC with the changing platform velocity and orientations. On the other hand, the LP image scenes also have to be long enough to provide more than 11 key-point correspondences for outlier removal using RANSAC. A sub-optimal choice of the number of LP image lines to use in the creation of LP image scenes in the pre-processing may lead to fewer observations to be used in the bundle adjustment. In the experiment, the LP image scenes created from 200 LP image lines were used as they yielded more observations than LP image scenes created from 100 or 300 LP image lines.
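The cost of a poor inlier ratio can be made concrete with the standard RANSAC iteration estimate, \(N = \log(1-p)/\log(1-w^{s})\), where \(w\) is the inlier ratio, \(s\) the minimal sample size, and \(p\) the desired confidence. The sketch below is a generic textbook formula rather than the configuration used in the experiment; the sample size of 11 and the confidence of 0.99 are assumptions, and the function name is hypothetical.

```python
import math


def ransac_iterations(inlier_ratio, sample_size=11, confidence=0.99):
    """Iterations needed to draw at least one all-inlier sample with given confidence."""
    if not 0.0 < inlier_ratio < 1.0:
        raise ValueError("inlier_ratio must lie strictly between 0 and 1")
    # Probability that one random minimal sample contains only inliers.
    p_clean_sample = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_clean_sample))


# Assumed inlier ratios, for illustration only.
iterations_good = ransac_iterations(0.8)
iterations_poor = ransac_iterations(0.5)
```

Because the inlier ratio enters with the power of the sample size, a moderate drop in the share of consistent correspondences, e.g., caused by scenes that are too long for the model to hold, increases the required number of iterations by orders of magnitude.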

HSI consists of hundreds of radiometric bands, but only a single band or a very limited number of bands are used in modern key-point detectors and descriptors. Thus, most of this information remains unexploited when detecting and describing key points from HSI cameras with sub-pixel chromatic aberration. The abundant radiometric information from HSI could be considered in future work to limit the spatial window used for key-point detection and description.

5.2 Bundle Adjustment from LP Image Lines

The results in Table 1 show that the remaining reprojection errors are at the sub-pixel level, but slightly larger in the image y‑dimension. This difference between the two image dimensions is expected, as the rotation compensation is only considered in the image x‑dimension and the pinhole model used does not account for the motion blur. Still, the reprojection errors are similar to the expected observation precision. The error statistics of the CPs in object space show NMAD \(\sim 1/4\) of the GSD in each of east and north (Table 2). Table 3 shows similar planimetric error statistics, which indicates that GCPs are not needed for high planimetric accuracy in the bundle adjustment from LP image lines. The large height errors in Table 2 are mainly caused by the poor base-to-height relation, which is worsened by the lack of forward overlap. This also leads to poor precision estimates of the CPs (Fig. 7). A larger FoV or additional information about image depth would be needed to provide more precise height coordinates in object space.

The precision estimates in the height dimension of the CPs are relatively large compared to the GSD (Fig. 7), and worse for the CPs towards the middle of the survey area. This is a natural consequence of the small intersection angles from all observing LP image lines, as the LP image lines from the blue flight strips in Fig. 5 do not cover these CPs.
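The influence of the intersection angle can be illustrated with a simplified two-ray model, which is not the adjustment used in the experiment. It assumes two rays that are symmetric about the vertical, intersect at an angle \(\gamma\), and have equal, uncorrelated planimetric precision. Under these assumptions the ratio of height precision to planimetric precision is \(1/\tan(\gamma/2)\), which equals twice the height-to-base ratio. The function name and the example angles are hypothetical.

```python
import math


def height_to_planimetric_ratio(intersection_angle_deg):
    """Ratio sigma_height / sigma_planimetric for two symmetric rays (toy model)."""
    half_angle = math.radians(intersection_angle_deg) / 2.0
    return 1.0 / math.tan(half_angle)


# Assumed intersection angles in degrees, for illustration only.
ratio_narrow = height_to_planimetric_ratio(10.0)  # roughly 11
ratio_wide = height_to_planimetric_ratio(90.0)    # roughly 1
```

In this toy model an intersection angle of about 10 degrees already makes the height roughly an order of magnitude less precise than the planimetric coordinates, which matches the qualitative behavior described above.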

The trajectory modeling makes it possible to accurately estimate camera-specific parameters for other cameras, e.g., frame cameras, with common trajectory parameters from simultaneous acquisition. Data acquisition with the use of multiple cameras, regardless of them being LP or frame cameras, is expected to provide additional observations and complement each other in the estimation of trajectory corrections. Based on earlier work (Jonassen et al. 2023), LiDAR point clouds are also expected to provide additional observations in a hybrid adjustment with LP and frame images. Such a joint adjustment is expected to provide a better basis for the estimation of image depth.

5.3 Chromatic Aberration in LP HSI Cameras

The chromatic aberration can be approximately modeled as a separate principal distance correction for each spectral band. The integration time of the camera was set so that spectral bands above the red-edge spectral region at \(\sim 750\,\text{nm}\) were close to saturation in vegetation pixels. As the reference squares appear bright in all spectral bands (Fig. 6b), those located on grass provide weak neighborhood contrast in the bands above the red-edge spectral region. Thus, the reference squares are more often identified in the lower spectral bands and hence provide more observations in these bands for the estimation of the principal distance correction. The principal distance correction difference from the mean of these corrections was \(\leq 11.8\,\upmu\text{m}\), with a‑posteriori precision estimates of 1.2–2.4 \(\upmu\)m, in the HySpex VNIR-1800 LP HSI camera (Fig. 8). The effect of this difference corresponds to a maximum absolute shift \(\leq 0.27\) pixels in the LP image line. In a laboratory experiment, Torkildsen and Skauli (2018) reported a smaller sub-pixel chromatic aberration for a camera from the HySpex manufacturer.
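The order of magnitude of this shift can be checked with a first-order pinhole relation: a change \(\Delta c\) in the principal distance \(c\) moves an image point at offset \(x\) from the principal point by \(\Delta x = x\,\Delta c / c\). The Python sketch below uses the values stated in the text (11.8 \(\upmu\)m and 40.3 mm). The offset of 900 pixels is an assumption, corresponding to the edge of an 1800-pixel line with the principal point at its center, and the function name is hypothetical.

```python
def chromatic_shift_px(offset_px, delta_c_um, principal_distance_mm):
    """First-order image shift (pixels) caused by a principal distance change."""
    delta_c_mm = delta_c_um * 1e-3  # micrometers to millimeters
    return offset_px * delta_c_mm / principal_distance_mm


# 900 px is an assumed maximum offset from the principal point.
edge_shift = chromatic_shift_px(offset_px=900, delta_c_um=11.8,
                                principal_distance_mm=40.3)  # about 0.26 pixels
```

Under these assumptions the shift at the edge of the LP image line is about a quarter of a pixel, consistent with the stated bound, and it decreases linearly towards the principal point.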

The chromatic aberration may vary significantly between camera systems and manufacturers (e.g., Torkildsen and Skauli 2018). In the presented experiment, the results showed that the chromatic aberration was smaller than the observation precision in the LP image line (Fig. 8). Consequently, this would allow the use of a single principal distance for all spectral bands for this particular LP HSI camera.

6 Future Work

The number of LP image lines used to create the LP image scenes affects the number of resulting key-point correspondences. Thus, the optimal number of LP image lines for this pre-processing step should be further examined for a wider range of dynamics.

A method to retrieve more accurate observations for the bundle adjustment from LP image lines and their respective precision estimates should be further investigated. A key property of HSI is the rich radiometry captured with narrow spectral bands. Thus, a descriptor for multi-band images, such as HSI, using smaller spatial neighborhoods where descriptors contain more radiometric information could be a topic for further study. Such a descriptor may improve the observation retrieval from LP image scenes and limit the number of observation outliers when the chromatic aberration is at a sub-pixel level.

The general trajectory correction model and the weighting of the observations are a good foundation for multi-sensor adjustment with other sensor data such as frame images and LiDAR point clouds. Thus, future experiments can be expanded to take advantage of, without being dependent on, observations from fundamentally different sensors in a joint adjustment.

7 Conclusions

Accurate camera orientation is crucial for the generation of high-quality georeferenced orthoimages. Thus, we have proposed a method to retrieve observations to use in the bundle adjustment from an LP HSI camera. The experiment resulted in an accuracy of up to 0.08 m NMAD in each planimetric dimension, i.e., \(\sim 1/4\) of the GSD (0.3 and 0.4 m), which is better than what was reported in similar studies in the past, e.g., \(2.3\) times the GSD for Kim et al. (2021). The height error of 0.99 m NMAD was several times the GSD and was mainly limited by the small angle of intersection. The results from the presented bundle adjustment show sub-pixel NMAD of the reprojection errors.

The chromatic aberration is parameterized as principal distance corrections per spectral band in the presented bundle adjustment. The results show that the chromatic aberration is slightly larger than what has been reported from laboratory calibrations of cameras of the same model, but still small enough to be at a sub-pixel level in the LP image line.