1 Introduction

The development of vision-based navigation systems for aerospace applications has grown considerably in recent years. Image processing techniques and depth sensors have advanced rapidly, driven by the fast growth of the space sector [1,2,3]. One of the problems to be addressed by vision-based sensing is the development of effective technologies to support proximity operations, such as autonomous precision landing, docking, rendezvous, and capture.

Current technologies for proximity operations often rely on infrared beacons and depth sensors to estimate the desired state vector [4]. Previous NASA work has demonstrated success with photogrammetric approaches, such as the Advanced Video Guidance Sensor (AVGS), which uses an inverse perspective algorithm that is sufficiently accurate within the 0–300 m range and can support a variety of guidance, navigation, and control operations, as demonstrated on the Demonstration for Autonomous Rendezvous Technology (DART) mission [5, 6]. The main limitation of AVGS is its hardware requirements: state-of-the-art lasers and CMOS imaging sensors [7]. This prompted the development of a more portable and low-cost substitute, the Smartphone Video Guidance Sensor (SVGS). SVGS is based on an adaptation of the collinearity equations used by its predecessor together with improved image processing techniques. SVGS hardware requirements are limited to a CPU in the host platform (in this case, an Android smartphone), a camera, and a four-point illuminated target [6,7,8].

SVGS range can be scaled by changing the dimensions of the four-point beacon, and the sensor has shown significant accuracy in all 6 DoF [9] for ranges up to 200 m. Potential applications of SVGS include autonomous planetary landing, payload delivery, and docking. SVGS has also been compared in the literature with other computer vision techniques, such as binary fiducial markers, for GPS-denied environments, showing overall precision-landing performance similar to theirs, as reported by Bautista et al. [10].

1.1 SVGS algorithm and concept of operation

SVGS is a software sensor that estimates an illuminated target’s 6 DoF position and attitude vector in a coordinate system attached to the host’s camera. It can be easily integrated into robotic systems and flight controllers. An example deployment of SVGS can be seen in Fig. 1 [9].

Fig. 1 Smartphone Video Guidance Sensor (SVGS) concept [9]

Fig. 2 Coordinate system frames; image plane represented in blue [9]

The SVGS algorithm is based on an adaptation of the collinearity equations developed by Rakoczy [7]. Consider the perspective center of the host platform camera (thin lens) located at point \({\textbf {L}}\) and the target at point \({\textbf {A}}\). The representation of \({\textbf {A}}\) in the captured image plane is denoted \({\textbf {a}}\), and the distance between \({\textbf {L}}\) and that plane, \({\textbf {f}}\), is the focal length of the host platform camera. In this context, Fig. 2 displays the target \(\langle\) X, Y, Z\(\rangle\) and image coordinate \(\langle\) x, y, z\(\rangle\) frames; the vector \(v_A\) denotes the vector from \({\textbf {L}}\) to \({\textbf {A}}\) in the target frame, while the vector \(v_a\) denotes the vector from \({\textbf {L}}\) to \({\textbf {a}}\) in the image frame [9].

$$\begin{aligned} \varvec{v_A}=\left[ \begin{array}{l} X_A-X_L \\ Y_A-Y_L \\ Z_A-Z_L \end{array}\right] ; \quad \varvec{v_a}=\left[ \begin{array}{l} x_a-x_0 \\ y_a-y_0 \\ -f \end{array}\right] . \end{aligned}$$
(1)

The above-mentioned vectors are related by Eq. (2), where M is the spatial rotation matrix for the transformation from the target frame to the image frame, and k is a scaling factor [9].

$$\begin{aligned} \varvec{v_a}=k \varvec{M v_A} \end{aligned}$$
(2)

Generalizing Eq. (2) by removing the a and A subscripts, solving for the image-frame coordinates, and dividing x and y by the z component to eliminate the scaling factor, one obtains the following equations [9]:

$$\begin{aligned} x &= f \frac{m_{11}\left( X-X_L\right) +m_{12}\left( Y-Y_L\right) +m_{13}\left( Z-Z_L\right) }{m_{31}\left( X-X_L\right) +m_{32}\left( Y-Y_L\right) +m_{33}\left( Z-Z_L\right) }+x_0=\varvec{F_x} \end{aligned}$$
(3)
$$\begin{aligned} y &= f \frac{m_{21}\left( X-X_L\right) +m_{22}\left( Y-Y_L\right) +m_{23}\left( Z-Z_L\right) }{m_{31}\left( X-X_L\right) +m_{32}\left( Y-Y_L\right) +m_{33}\left( Z-Z_L\right) }+y_0=\varvec{F_y} \end{aligned}$$
(4)

Where \(\varvec{m_{ij}}\) denotes the elements of the rotation matrix M. The 6 DoF vector in the image frame, \(\varvec{V}\), is described below [9]:

$$\begin{aligned} \varvec{V}=\left[ \begin{array}{llllll} X_L&Y_L&Z_L&\phi&\theta&\psi \end{array}\right] ^T \end{aligned}$$
(5)

Where \(\phi\), \(\theta\), and \(\psi\) are the rotation angles about the x, y, and z axes. Linearizing \(F_x\) and \(F_y\) using a Taylor series expansion truncated after the second term results in the following expressions [9]:

$$\begin{aligned} x=\varvec{F_x}\left( V_0\right) + \varvec{\frac{\partial F_x}{\partial V} }\Delta V+\varepsilon _x, \quad y=\varvec{F_y}\left( V_0\right) + \varvec{\frac{\partial F_y}{\partial V} }\Delta V+\varepsilon _y \end{aligned}$$
(6)

The initial guess for the 6 DoF vector is expressed by \(V_0\), while V is the actual state vector [9].

$$\begin{aligned} \Delta \varvec{V=V-V_0} \end{aligned}$$
(7)

The Taylor series approximation errors in x and y are defined as \(\varepsilon _x\) and \(\varepsilon _y\). In addition, each of the four LEDs yields an individual pair of \(F_x\) and \(F_y\) expressions, resulting in eight equations per estimation, which can be expressed in matrix form as follows [9]:

$$\begin{aligned} Y = \left[ \begin{array}{c} x_1 \\ y_1 \\ \vdots \\ x_4 \\ y_4 \end{array}\right] ; \quad Y_0=\left[ \begin{array}{c} F_{x_1} \\ F_{y_1} \\ \vdots \\ F_{x_4} \\ F_{y_4} \end{array}\right] \end{aligned}$$
(8)
$$\begin{aligned} \varvec{H}= & {} \begin{bmatrix} \varvec{\dfrac{\partial {F_x}_1}{\partial V}}&\varvec{\dfrac{\partial {F_y}_1}{\partial V}}&\ldots&\varvec{\dfrac{\partial {F_x}_4}{\partial V}}&\varvec{\dfrac{\partial {F_y}_4}{\partial V}} \end{bmatrix} \end{aligned}$$
(9)
$$\begin{aligned} \varvec{Y} = \varvec{Y}_{\varvec{0}}+\varvec{H}\,\Delta \varvec{V}+\varvec{\varepsilon } \end{aligned}$$
(10)

To estimate the state vector, the system of equations is solved for the correction \(\Delta V\) that minimizes the sum of the squared errors \(\varepsilon\). The solution is added to the current value of V, and the process is repeated until the residual errors are small enough, resulting in the final value of V [9].
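This iterative refinement is essentially a Gauss–Newton least-squares update. The Python/NumPy sketch below illustrates the loop under the assumption that `predict` and `jacobian` are user-supplied callables implementing Eqs. (3)–(4) for the four LEDs and the matrix H of Eq. (9); it is an illustration of the procedure, not the SVGS flight code.

```python
import numpy as np

def estimate_state(y_meas, predict, jacobian, v0, tol=1e-6, max_iter=50):
    """Iterative least-squares (Gauss-Newton) solution of Eq. (10).

    y_meas   : (8,) measured blob centroids [x1, y1, ..., x4, y4]
    predict  : callable V -> (8,) predicted centroids (Eqs. 3-4, one pair per LED)
    jacobian : callable V -> (8, 6) matrix H of partial derivatives (Eq. 9)
    v0       : (6,) initial guess [X_L, Y_L, Z_L, phi, theta, psi]
    """
    v = np.asarray(v0, dtype=float)
    for _ in range(max_iter):
        residual = y_meas - predict(v)            # Y - Y0
        H = jacobian(v)                           # 8 x 6 Jacobian
        # Least-squares correction: dV minimizing ||residual - H dV||^2
        dv, *_ = np.linalg.lstsq(H, residual, rcond=None)
        v += dv                                   # V <- V + dV
        if np.linalg.norm(dv) < tol:              # stop when the correction is small
            break
    return v
```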

1.2 SVGS collinearity simplification

To simplify the process described above, the azimuth and elevation angles are measured relative to the vector that connects the perspective center to the target location, reducing Eqs. (3) and (4) to the following equation [5, 7, 9, 11]:

$$\begin{aligned} A_z = \tan ^{-1}{\left( \frac{x-x_0}{f}\right) } ; \quad E_l = \sin ^{-1}{\left( \frac{y-y_0}{\sqrt{(x-x_0)^2+(y-y_0)^2 + f^2}}\right) } \end{aligned}$$
(11)
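As a worked example, the simplified angles of Eq. (11) can be computed directly from a blob centroid. The sketch below assumes the focal length f is expressed in pixels and that (x0, y0) is the camera principal point.

```python
import numpy as np

def azimuth_elevation(x, y, x0, y0, f):
    """Azimuth and elevation of the target direction per Eq. (11).

    (x, y)   : blob centroid in image coordinates (pixels)
    (x0, y0) : principal point of the camera (pixels)
    f        : focal length expressed in pixels
    """
    az = np.arctan2(x - x0, f)
    el = np.arcsin((y - y0) / np.sqrt((x - x0) ** 2 + (y - y0) ** 2 + f ** 2))
    return az, el
```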

1.3 SVGS algorithm implementation and robustness

SVGS state estimation begins by capturing an image of the 4-LED target, which is converted to a binary image by thresholding pixel brightness. Light blobs are identified by iterating over the resulting binary image, and their main characteristics are determined, e.g., mass, area, perimeter, inertia, and location. To reduce background noise, the blobs are filtered using geometric alignment, image processing, and statistical filters, yielding an optimized set of four target blobs. The centroid locations of the resulting blobs are used in the state estimation algorithm, which provides the 6 DoF state estimation vector (Fig. 3).
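The thresholding and blob-extraction steps can be prototyped with standard image-processing tools. The sketch below uses OpenCV connected-component analysis as a stand-in for the SVGS Android implementation; the brightness threshold of 200 is an assumed placeholder value.

```python
import cv2
import numpy as np

def detect_target_blobs(gray, threshold=200):
    """Threshold a greyscale (uint8) frame and extract candidate LED blobs.

    Returns a list of dicts with the blob properties used by the SVGS filters:
    centroid location, area (pixel count), and mass (summed brightness, Eq. 12).
    """
    _, mask = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask, connectivity=8)
    blobs = []
    for i in range(1, n):                          # label 0 is the background
        area = int(stats[i, cv2.CC_STAT_AREA])     # number of pixels in the blob
        mass = float(gray[labels == i].sum())      # Eq. (12): sum of pixel brightness
        blobs.append({"centroid": tuple(centroids[i]), "area": area, "mass": mass})
    return blobs
```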

The main robustness issue for SVGS is light reflections. Using the statistical filter based on the greyscale image, the correct four target blobs can be found by choosing the group with the smallest standard deviation (SD) of the ratio between blob mass and blob area:

$$\begin{aligned} M_{blob}= & {} \sum _{i=1}^{N} I_i \end{aligned}$$
(12)
$$\begin{aligned} A_{blob} = \sum _{i=1}^{N} 1 = N \end{aligned}$$
(13)

Where \(M_{blob}\) is the mass of the blob, \(A_{blob}\) is its area (the number N of pixels in the blob), and \(I_i\) represents the brightness of the i-th pixel belonging to a particular blob.
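A minimal sketch of the statistical filter is shown below: it evaluates candidate four-blob groups and keeps the one whose mass-to-area ratios have the smallest standard deviation. The exhaustive search over combinations is an illustrative simplification; the actual SVGS pipeline first prunes candidates with geometric and image-processing filters.

```python
from itertools import combinations
import numpy as np

def select_target_blobs(blobs):
    """Pick the 4-blob group with the smallest standard deviation of the
    mass/area ratio, as described by the statistical filter above."""
    best_group, best_sd = None, np.inf
    for group in combinations(blobs, 4):
        ratios = [b["mass"] / b["area"] for b in group]   # brightness density per blob
        sd = np.std(ratios)
        if sd < best_sd:
            best_group, best_sd = group, sd
    return best_group
```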

Fig. 3 SVGS algorithm logic [9]

1.4 Sensitivity to LED pattern

Another aspect of SVGS is its sensitivity to the target LED pattern which, although accounted for and mitigated by the geometric alignment step, requires special consideration of geometric configurations that may be difficult for the SVGS algorithm to identify.

In Fig. 4, each blob is identified as P1, P2, P3, P4. Note that \({proj}_\textbf{x} \textit{P4}\) and \({proj}_\textbf{x} \textit{P3}\) should ideally be located at the midpoint of the segment between P1 and P2, where \(\textbf{x}\) is the line \(\overline{\textit{P1} \textit{P2}}\).

Fig. 4 LED geometric alignment

The following checks are performed based on the scalar values:

$$\begin{aligned} -\epsilon _1 \le \frac{\mid x_1-x_2 \mid }{x} \le \epsilon _1 \end{aligned}$$
(14)
$$\begin{aligned} -\epsilon _2 \le \frac{a}{x} \le \epsilon _2 \end{aligned}$$
(15)

The \(\epsilon _{1,2}\) values can be tuned according to the expected LED target dimensions. Smaller values of a are less likely to cause the SVGS algorithm to misinterpret the target as mirrored or rotated at large orientation angles.
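The sketch below illustrates one plausible reading of the checks in Eqs. (14)–(15), based on Fig. 4: x1 and x2 are taken as the distances from P1 and P2 to the projection of P4 onto the line P1P2, x as the length of P1P2, and a as the along-line separation between the projections of P3 and P4. These interpretations, and the default epsilon values, are assumptions made for illustration.

```python
import numpy as np

def alignment_checks(p1, p2, p3, p4, eps1=0.1, eps2=0.1):
    """Geometric alignment checks in the spirit of Eqs. (14)-(15).

    p1..p4 are blob centroids as (x, y) pairs. The epsilon thresholds are
    tunable according to the expected LED target dimensions.
    """
    p1, p2, p3, p4 = map(np.asarray, (p1, p2, p3, p4))
    x = np.linalg.norm(p2 - p1)                 # length of segment P1P2
    u = (p2 - p1) / x                           # unit vector along line P1P2
    t3 = np.dot(p3 - p1, u)                     # projected coordinate of P3 on the line
    t4 = np.dot(p4 - p1, u)                     # projected coordinate of P4 on the line
    x1, x2 = t4, x - t4                         # distances of proj(P4) to P1 and to P2
    a = abs(t4 - t3)                            # separation between the two projections
    return abs(x1 - x2) / x <= eps1 and a / x <= eps2
```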

1.5 Objectives

This paper aims to emulate precision drone landing using SVGS in hardware-in-the-loop simulation and compare its performance to other existing technology, particularly the IRLock and LiDAR approach, which uses an infrared beacon and a laser range sensor to support automated landing.

2 Sensors and hardware description

2.1 Flight controller unit (FCU)

The flight controller unit used is the Holybro Pixhawk 4, running PX4 firmware [12]. The FCU has an STM32F765 as its central processor and an STM32F100 as an I/O processor. The firmware supports several flight modes, but this investigation focuses on Offboard mode, in which the FCU receives setpoints directly from a ground-station computer running the simulation. The Pixhawk 4 uses its internal accelerometers, gyroscopes, magnetometer, and barometer to estimate attitude and position.

2.2 IRLock + LiDAR

IRLock is a navigation technology that uses a camera (PixyCam) with an embedded IR filter to capture images of an infrared beacon (MarkOne beacon) and estimate its x and y position (Fig. 5). A LiDAR sensor in conjunction with IRLock is necessary to obtain the z distance (range) estimate relative to a datum surface. The range sensor used was a Benewake TFmini Plus LiDAR Rangefinder with a range of 12 m.

2.3 Samsung Galaxy S8 + smartphone video guidance sensor (SVGS)

The CPU host platform used to run the SVGS application is a Samsung Galaxy S8, an Android smartphone equipped with a thin-lens camera (Fig. 6).

Fig. 5 IRLock beacon (left) and IRLock camera (right)

Fig. 6 SVGS host platform: Samsung Galaxy S8 smartphone

2.4 Programmable linear motion stage

The linear motion stage used in the investigation was controlled by a National Instruments motion controller (NI PXI-7350) connected to a Parker Hannifin ERV56 linear actuator driven by a DC servo motor with a high-resolution built-in encoder (accuracy of ± 2 arcmin). The encoder reports the stage position to the motion-controller PC.

2.5 Ground-station computer

A computer running Ubuntu 20.04 with ROS Noetic Desktop, Gazebo 11, and QGroundControl is used as the ground station. It is responsible for integrating the virtual environment that fuses the experimental data with the emulated Pixhawk 4 internal sensor data.

3 Software integration

3.1 Robot operating system (ROS)

The hardware-in-the-loop simulation used ROS (Robot Operating System). ROS is a middleware framework that uses a publish/subscribe communications architecture and is commonly used in robotics applications (Fig. 7). This approach discretizes the required software functionality into programs called nodes, which can support different missions, and enables communication between multiple packages. The packages used in this investigation are:

  1. MAVROS: Allows communication with the FCU using MAVLink (FCU communication protocol and serial interpreter) via UART connection over USB [13].

  2. AAS_HIL: Package used to represent the sensor data collected from the linear motion stage to emulate autonomous landing. It is also responsible for determining the setpoints sent to the FCU via MAVROS. Relevant nodes are the following:

     (a) “motion_control_AAS”: This node parses data (raw output of sensors) and calculates the absolute error between the encoder and sensor values. It is also responsible for sending the setpoints to the “px4_control” node.

     (b) “px4_control_AAS”: This node subscribes to the “motion_control” node topics. It obtains setpoint data and sends it to the flight computer at 30 Hz (a minimal sketch of such a setpoint publisher is shown after this list).

     (c) “data_recording_AAS”: This node subscribes to relevant variables and records them in a CSV file at 25 Hz for final data and error analysis.
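As referenced above, a minimal sketch of a 30 Hz offboard setpoint publisher is shown below. It uses the standard MAVROS position-setpoint topic and message type for simplicity; the actual AAS_HIL nodes publish on mavros/setpoint_raw/local, and their internal structure may differ.

```python
#!/usr/bin/env python
# Minimal sketch of a 30 Hz setpoint publisher in the spirit of "px4_control_AAS".
# The topic and message type are standard MAVROS conventions; the example hover
# altitude is an arbitrary placeholder.
import rospy
from geometry_msgs.msg import PoseStamped

def main():
    rospy.init_node("px4_control_sketch")
    pub = rospy.Publisher("/mavros/setpoint_position/local", PoseStamped, queue_size=10)
    rate = rospy.Rate(30)                      # Offboard mode requires a steady setpoint stream
    setpoint = PoseStamped()
    setpoint.pose.position.x = 0.0
    setpoint.pose.position.y = 0.0
    setpoint.pose.position.z = 2.0             # hover 2 m above the target (example value)
    while not rospy.is_shutdown():
        setpoint.header.stamp = rospy.Time.now()
        pub.publish(setpoint)
        rate.sleep()

if __name__ == "__main__":
    main()
```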

Fig. 7 ROS Noetic architecture [14]

3.2 Gazebo 11

Gazebo is an open-source platform that allows simulation of the flight behavior and dynamics of an Unmanned Aerial Vehicle (UAV) model [15]. It supports integration with PX4 firmware, allowing the FCU's internal sensors to be simulated according to each environment's definitions. For this study, the UAV model used was the Iris quadcopter, and conditions were considered ideal (no external interference).

3.3 QGroundControl

QGroundControl is the ground-control software used to receive the emulated MAVLink messages from the FCU during simulation. It is used to monitor the emulated sensors and the UAV behavior shown in Gazebo (Fig. 8).

Fig. 8 QGroundControl hardware-in-the-loop overview

4 Methodology

4.1 Experimental setup and description

Both IRLock + LiDAR and the Android smartphone running SVGS were secured to the programmable linear motion stage, and their corresponding beacons/targets were mounted at a fixed location, either perpendicular or parallel to the direction of motion for each axis being tested (Fig. 9). The linear motion stage received a 0.05 Hz sinusoidal input command with a different amplitude for each axis: 0.42 m, 0.3 m, and 1 m for the x, y, and z axes, respectively.

Fig. 9 Test setup - motion control of the vision sensor relative to a fixed landing target

The test setup for IRLock + LiDAR consisted of a flight controller (PX4) connected to an external voltage source, the IRLock camera (x and y position estimates), and the LiDAR (z estimate), with the IRLock beacon mounted on a flat surface (Fig. 10). The SVGS setup consisted of the Android smartphone and a four-point illuminated target mounted at the same distance (Fig. 11).

Fig. 10 IRLock + LiDAR setup

Fig. 11 SVGS test view (left) and luminous target (right)

The end-to-end setups are shown in Figs. 12 and 13. The estimated loop delay of data transmission between the external sensors and the FCU is 175 ms to 230 ms, with the mean SVGS delay closer to the upper bound and the mean LiDAR + IRLock delay closer to the lower bound. These values are used as time delays for the Extended Kalman Filter algorithm (EKF2) used by the FCU firmware.

Fig. 12 End-to-end setup, IRLock + LiDAR

Fig. 13 End-to-end setup, SVGS

The hardware-in-the-loop simulation aims to emulate the autonomous precision landing of a UAV with the landing target in visual sight for each method, following the missions outlined in Figs. 14, 15, 16 and 17:

Fig. 14 CONOPS - autonomous precision landing using SVGS

Fig. 15 Automated landing - operational logic using SVGS

Fig. 16 CONOPS - autonomous precision landing using IRLock + LiDAR

Fig. 17 Automated landing - operational logic using IRLock + LiDAR

4.2 HIL simulation

Each sensor's raw output collected during the motion stage experiments is parsed by the motion control ROS node for the x, y, and z axes, creating a three-dimensional matrix. The hardware-in-the-loop simulation uses the data segment of the sinusoidal motion's half period that ends at zero velocity. The z-axis encoder values are added to the distance between the motion stage and the landing target (beacon), scaling the values for comparison purposes. The PX4 FCU is connected to the ground-station computer via USB, which connects Gazebo to the FCU. Gazebo and QGroundControl exchange MAVLink messages via the User Datagram Protocol (UDP) on Gazebo's port 14550, allowing the latter to communicate with the FCU, while ROS communicates with the flight controller via MAVROS/MAVLink using local UDP port 14557 (Fig. 18).

Fig. 18 Simulation overview

5 Implementation

Simulated landing motion is generated from a half period of the sinusoidal motion for each axis. The values are resampled to enable time alignment, since the sampling frequencies of the encoder, SVGS, and the FCU connected to IRLock and LiDAR differ. The data parsed into the motion control node consists of three matrices with thirteen rows and three columns, each representing one axis (x, y, and z). Each matrix row represents a setpoint sent to the FCU.

When the final setpoint is reached (i.e., the last row of the raw sensor data matrix), the landing loop continues to operate: after each iteration, 0.1 m is subtracted from the z component of the last setpoint until the z distance reaches 0 (landing achieved). The absolute position error after the final setpoint is reached is not reported, as shown in Fig. 19. The position error between each fused position estimate and the encoder value is defined as:

$$\begin{aligned} \sigma _{error} = \sqrt{(X_{fused}-X_{encoder})^2+(Y_{fused}-Y_{encoder})^2+(Z_{fused}-Z_{encoder})^2} \end{aligned}$$
(16)
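The descent logic and the error metric of Eq. (16) can be summarized in a short sketch; the 0.1 m step follows the description above, while the function and variable names are illustrative.

```python
import numpy as np

def landing_descent(last_setpoint, step=0.1):
    """Generate descent setpoints after the final row of the sensor matrix:
    the z component is decreased by `step` each iteration until touchdown (z = 0)."""
    x, y, z = last_setpoint
    while z > 0:
        z = max(z - step, 0.0)
        yield np.array([x, y, z])

def position_error(fused, encoder):
    """Euclidean position error of Eq. (16); inputs are (x, y, z) in metres."""
    return float(np.linalg.norm(np.asarray(fused) - np.asarray(encoder)))
```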
Fig. 19 Logic of the motion control node

5.1 ROS topics and Gazebo emulated sensor data

After starting the simulation, the Gazebo software and PX4 firmware allow the FCU's internal accelerometers, gyroscopes, magnetometers, and barometers to have their data emulated based on the flight path and dynamics of the UAV, as shown in Fig. 20.

Fig. 20 Gazebo and FCU communication

The ROS architecture works as a publish/subscribe interface through topics. The topics pertinent to the data analysis are listed below; a minimal logging sketch using these topics follows the list.

  1. mavros/local_position/pose: Estimated 6 DoF position and attitude of the vehicle, obtained by fusing the emulated FCU IMU data with the experimental data.

  2. mavros/setpoint_raw/local: Topic on which the setpoints are sent to the FCU.

  3. mavros/state: Information about the flight mode (stabilized, position, offboard) and whether the vehicle is armed (motors on) or not.

  4. aas_hil/currentError: Absolute error between the sensor and encoder setpoint.

  5. aas_hil/setpoint: Vector message containing the components of the setpoint parsed by the motion control node.

  6. aas_hil/customRoutineDone: Boolean indicating whether the offboard motion routine (once the switch to offboard is made) has been completed.
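As referenced above, a minimal logging sketch using these topics, in the spirit of the data_recording_AAS node, is shown below. The message type of aas_hil/currentError is not specified in the package description and is assumed here to be std_msgs/Float64.

```python
#!/usr/bin/env python
# Minimal sketch of a 25 Hz CSV logger in the spirit of "data_recording_AAS".
import csv
import rospy
from geometry_msgs.msg import PoseStamped
from std_msgs.msg import Float64

latest = {"pose": None, "error": None}

def pose_cb(msg):
    latest["pose"] = msg.pose.position          # local position estimate from the FCU

def error_cb(msg):
    latest["error"] = msg.data                  # absolute sensor-vs-encoder error

def main():
    rospy.init_node("data_recording_sketch")
    rospy.Subscriber("/mavros/local_position/pose", PoseStamped, pose_cb)
    rospy.Subscriber("/aas_hil/currentError", Float64, error_cb)
    rate = rospy.Rate(25)                       # record at 25 Hz, as described above
    with open("hil_log.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "x", "y", "z", "abs_error"])
        while not rospy.is_shutdown():
            p, e = latest["pose"], latest["error"]
            if p is not None and e is not None:
                writer.writerow([rospy.Time.now().to_sec(), p.x, p.y, p.z, e])
            rate.sleep()

if __name__ == "__main__":
    main()
```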

5.2 Sensor fusion

The PX4 Autopilot software uses an Extended Kalman Filter algorithm (EKF2) to fuse the available sensor data. The EKF2 algorithm has been tested on several multirotor vehicles [12, 16]. This section presents the algorithm to illustrate fusion with the SVGS estimates. The PX4 Extended Kalman Filter is formulated as follows:

State Prediction:

$$\begin{aligned} \hat{x} (k+1\mid k) = f(\hat{x}(k), u(k)) \end{aligned}$$
(17)

The left-hand side of Eq. 17, \(\hat{x}(k+1\mid k)\), is the predicted state at \(t_{k+1}\) given the measurements gathered up to \(t_k\), whereas the right-hand side, \(f( \hat{x}(k), u(k))\), represents the UAV nonlinear model function, which depends on the state estimate \(\hat{x}(k)\) and the control input u(k).

$$\begin{aligned} P(k+1\mid k) = F(k)P(k\mid k)F(k)^T + Q(k) \end{aligned}$$
(18)

Where \(P(k+1\mid k)\) is the covariance matrix predicted at \(t_{k+1}\), F(k) is the Jacobian of f with respect to the states evaluated at \(\hat{x}(k)\), and Q(k) is the process noise covariance matrix.

State Correction:

The Kalman gain matrix is defined as follows:

$$\begin{aligned} \begin{aligned} K(k+1)&= P(k+1\mid k)H(k+1)^T[ H(k+1)P(k+1\mid k)H(k+1)^T \\&\quad + R(k+1) ]^{-1} \end{aligned} \end{aligned}$$
(19)

\(H(k+1)\) is the Jacobian of the measurement function h with respect to the states evaluated at \(\hat{x}(k+1\mid k)\), and \(R(k+1)\) is the measurement noise covariance matrix. The corrected state estimate is then obtained as follows:

$$\begin{aligned} \hat{x}(k+1) = \hat{x}(k+1\mid k) + K(k+1)[ y(k+1) - h( \hat{x}(k+1\mid k))] \end{aligned}$$
(20)

where, \(y(k+1)\) is the measurement at \(t_{k+1}\), and \(h( \hat{x}(k+1\mid k))\) is the predicted measurement at \(t_{k+1}\) considering the predicted state \(\hat{x}(k+1\mid k)\).

Finally, the covariance matrix is updated, as shown in Eq. 21:

$$\begin{aligned} P(k+1) = [ I - K(k+1)H(k+1) ] P(k+1\mid k) \end{aligned}$$
(21)
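For reference, the predict/correct cycle of Eqs. (17)–(21) can be written compactly as a generic EKF step. The sketch below is not the PX4 EKF2 implementation; the model functions and their Jacobians are supplied as callables.

```python
import numpy as np

def ekf_step(x, P, u, y, f, F_jac, h, H_jac, Q, R):
    """One EKF predict/correct cycle following Eqs. (17)-(21).

    f, h are the process and measurement models; F_jac, H_jac return their
    Jacobians evaluated at the current estimate.
    """
    # Prediction (Eqs. 17-18)
    x_pred = f(x, u)
    F = F_jac(x, u)
    P_pred = F @ P @ F.T + Q

    # Correction (Eqs. 19-21)
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R                      # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)           # Kalman gain, Eq. (19)
    innovation = y - h(x_pred)                    # Eq. (25)
    x_new = x_pred + K @ innovation               # state correction, Eq. (20)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred     # covariance update, Eq. (21)
    return x_new, P_new
```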

5.3 SVGS and IRLock + LiDAR Fusion

To perform sensor fusion within the PX4 framework, the estimation messages must be transmitted at 50 Hz if the associated covariance is not available, or at 30 Hz if the covariance values are included [16]. Achieving 50 Hz is typically not a problem for IRLock + LiDAR; however, since SVGS running on the Samsung Galaxy S8 has a reliable update frequency between 6 Hz and 15 Hz, integrating it properly becomes a challenge.

The solution adopted is to use a Kalman filter with the raw SVGS measurements as inputs to provide SVGS estimates at the required 30 Hz. This technique has some implications, which are further discussed in the Innovation Test section.
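A minimal sketch of such an upsampling filter is shown below: a constant-velocity Kalman filter per axis that is propagated at 30 Hz and corrected whenever a raw SVGS sample (6–15 Hz) arrives. The process and measurement noise values are placeholder assumptions, not the tuned values used in the experiment.

```python
import numpy as np

class SvgsUpsampler:
    """Constant-velocity Kalman filter used to republish one axis of the SVGS
    position estimate at 30 Hz."""

    def __init__(self, dt=1.0 / 30.0, q=1e-2, r=5e-3):
        self.x = np.zeros(2)                       # state: [position, velocity]
        self.P = np.eye(2)
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity transition
        self.Q = q * np.eye(2)                      # process noise (placeholder)
        self.H = np.array([[1.0, 0.0]])             # position-only measurement
        self.R = np.array([[r]])                    # measurement noise (placeholder)

    def predict(self):
        """Called every 1/30 s, with or without a new SVGS sample."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[0]

    def update(self, z):
        """Called only when a raw SVGS measurement (6-15 Hz) arrives."""
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + (K @ (np.array([z]) - self.H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P
```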

5.3.1 SVGS covariance

The SVGS covariance was previously estimated using known motion profiles in accuracy-assessment experiments [16]. The covariance estimation requires a test setup similar to the one presented in the hardware-in-the-loop experiment, comparing sensor values to the motion stage encoder ground-truth values. The sensor noise covariance matrix is given by:

$$\begin{aligned} \varvec{\Theta } = E[\theta (t)\theta ^T(t)] =\mu I \delta (t-\tau ) \end{aligned}$$
(22)

Where E denotes the expectation operator, \(\mu\) is an arbitrary design parameter, \(\delta (t-\tau )\) is the Kronecker delta function, \(\tau\) is the time delay between the motion stage and the SVGS measurements, and I is the identity matrix. Expanding:

$$\begin{aligned} \varvec{\Theta } = \mu \begin{bmatrix} \sigma _\theta ^2 &{} \sigma _{\dot{\theta }} \sigma _\theta &{} \sigma _{\dot{x}} \sigma _\theta &{} \sigma _{\dot{x}} \sigma _\theta &{} \sigma _{\dot{y}} \sigma _\theta &{} \sigma _{\dot{y}} \sigma _\theta \\ \sigma _{\dot{\theta }} \sigma _\theta &{} \sigma _\theta ^2 &{} \sigma _{\dot{x}} \sigma _\theta &{} \sigma _{\dot{x}} \sigma _\theta &{} \sigma _{\dot{y}} \sigma _\theta &{} \sigma _{\dot{y}} \sigma _\theta \\ \sigma _{\dot{x}} \sigma _\theta &{} \sigma _{\dot{x}} \sigma _\theta &{} \sigma _x^2 &{} \sigma _{\dot{x}} \sigma _x &{} \sigma _{\dot{y}} \sigma _x &{} \sigma _{\dot{y}} \sigma _x \\ \sigma _{\dot{x}} \sigma _\theta &{} \sigma _{\dot{x}} \sigma _\theta &{} \sigma _{\dot{x}} \sigma _x &{} \sigma _x^2 &{} \sigma _{\dot{y}} \sigma _x &{} \sigma _{\dot{y}} \sigma _x \\ \sigma _{\dot{y}} \sigma _\theta &{} \sigma _{\dot{y}} \sigma _\theta &{} \sigma _{\dot{y}} \sigma _x &{} \sigma _{\dot{y}} \sigma _x &{} \sigma _y^2 &{} \sigma _{\dot{y}} \sigma _y \\ \sigma _{\dot{y}} \sigma _\theta &{} \sigma _{\dot{y}} \sigma _\theta &{} \sigma _{\dot{y}} \sigma _x &{} \sigma _{\dot{y}} \sigma _x &{} \sigma _{\dot{y}} \sigma _y &{} \sigma _y^2 \end{bmatrix} \end{aligned}$$
(23)

The diagonal elements represent the variance, and the off-diagonal elements represent the covariance between state variables.

Taking \(\mu\) equal to 1.6 and using the results obtained by Hariri et al., the noise covariance matrix becomes:

$$\begin{aligned} \varvec{\Theta } = 1.6 \begin{bmatrix} 0.000027 &{} -0.000003 &{} 0 &{} 0 &{} 0 &{} 0 \\ -0.000003 &{} 0.000016 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0.006 &{} -0.0069 &{} 0 &{} 0 \\ 0 &{} 0 &{} -0.0069 &{} 0.0552 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0.0051 &{} 0.0003 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0.0003 &{} 0.0405 \end{bmatrix} \end{aligned}$$
(24)

The values displayed in Eq. 24 were used as a floor covariance matrix transmitted with the SVGS estimates to the flight controller.

5.4 Innovation test

For the sensor values to be successfully fused, they must pass an innovation ratio test [12]. The mathematical formulation of the test is presented below:

$$\begin{aligned} \nu (k+1) = y(k+1) - h(\hat{x}(k+1\mid k)) \end{aligned}$$
(25)

Where \(\nu\) is the innovation vector. Expanding:

$$\begin{aligned} \nu _{k+1}^T (R_{k+1} + H_{k+1} P_{k+1\mid k} H_{k+1}^T)^{-1} \nu _{k+1} \le InnovationThreshold^2 \end{aligned}$$
(26)

Here, the left-hand side represents the squared Mahalanobis distance of the innovation. If this distance exceeds the user-defined threshold, the measurement is considered an outlier and is not fused, i.e., it is rejected by the EKF2.
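The gating check of Eq. (26) reduces to a few lines; the sketch below assumes the innovation vector, measurement Jacobian, predicted covariance, and measurement noise are already available from the EKF step.

```python
import numpy as np

def passes_innovation_test(nu, H, P_pred, R, gate):
    """Innovation consistency check of Eq. (26): accept the measurement only
    if its squared Mahalanobis distance is within the squared gate."""
    S = R + H @ P_pred @ H.T                      # innovation covariance
    d2 = float(nu @ np.linalg.inv(S) @ nu)        # squared Mahalanobis distance
    return d2 <= gate ** 2
```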

5.4.1 EKF2 gate size

Moreover, EKF2 allows the user to define a gate size, g, for each sensor, which is directly related to the sensor's covariance matrix and is used to obtain the acceptance region for the Mahalanobis distance. Increasing the gate size causes an overall decrease in fusion accuracy, whereas decreasing it may increase accuracy but reduces the number of measurements considered valid, potentially causing the positive definiteness of the system's observability matrix to no longer hold [12] (Table 1).

For this investigation, the following were considered:

Table 1 Gate size

6 Results

For all axes, the SVGS estimates show greater similarity to the encoder values. However, for the x- and y-axis errors reported for both SVGS and IRLock + LiDAR, the discrepancy is not as significant as the one reported for the z-axis (Figs. 21 and 22). This is expected, since both approaches use photogrammetric methods for the x and y axes. In addition, the LiDAR sensor model used in the comparison fails to generate accurate results during the experiment, particularly for range values smaller than 4 m (Fig. 23).

Fig. 21 Raw SVGS vs. IRLock, x-axis

Fig. 22 Raw SVGS vs. IRLock, y-axis

Fig. 23 Raw SVGS vs. IRLock, z-axis

Encoder data was also simulated in the Gazebo environment. The data segment used as input to the setpoint matrix is shown in Fig. 24 for all three axes.

Fig. 24 Simulated motion and corresponding estimations by SVGS and IRLock/LiDAR

The absolute error analysis, shown in Figs. 25 and 26, illustrates that SVGS is more accurate than IRLock + LiDAR in autonomous precision landing.

Fig. 25 SVGS error in comparison to fused encoder data

Fig. 26 IRLock + LiDAR error in comparison to fused encoder data

7 Conclusion

The hardware-in-the-loop experiments presented here illustrate the Smartphone Video Guidance Sensor's potential for navigation, guidance, and proximity operations of UAVs, in particular precision landing. The hardware-in-the-loop methodology enabled the assessment of landing accuracy by comparing simulated results with data collected from controlled motion experiments. The proposed methodology also enabled a performance comparison between SVGS and a leading state-of-the-art landing technology, IRLock + LiDAR.

SVGS's portability and low cost show great potential to support applications in small-satellite (CubeSat) operations, such as space debris removal and asteroid/planetary reconnaissance. Future work entails performing a similar comparison in actual landing missions in a GPS-denied environment.