1 Introduction

Swarm robotics offers to make micro air vehicle (MAV) applications more robust, flexible, and scalable (Şahin 2005; Brambilla et al. 2013). These properties pertain to a group’s ability to remain operable under loss of individual members and to reconfigure for different missions. Furthermore, a cooperating swarm of MAVs could execute tasks faster than any single MAV. The envisioned applications of such multi-agent robotic systems are plentiful. Examples of interest are: cooperative surveillance and/or mapping (Saska et al. 2016; Schwager et al. 2009a; Achtelik et al. 2012), localization of areas of sensory interest (e.g. chemical plumes) (Hayes et al. 2003; Schwager et al. 2009b), the detection of forest fires (Merino et al. 2006), or search missions in hazardous environments (Beard and McLain 2003). In order to deploy a team of MAVs for such applications, there are certain behaviors that the MAVs should be capable of, such as collision avoidance (Coppola et al. 2018; Roelofsen et al. 2015) or leader–follower/formation flight (Vásárhelyi et al. 2014; Hui et al. 2014; Gu et al. 2006). These tasks are accomplished by the MAVs through knowledge of the relative location of (at least) the neighboring MAVs in the group, for which several solutions can be found in literature.

Often used are external systems that provide a global reference frame within which agents can extract both their own and the other MAVs’ position. One example is (MCSs) (Schwager et al. 2009b; Mulgaonkar et al. 2015; Kushleyev et al. 2013; Michael et al. 2010; Turpin et al. 2012; Chiew et al. 2015; Hayes and Dormiani-Tabatabaei 2002). MCSs provide highly accurate location data, but only within the limited coverage provided by the system. Alternatively, (GNSS) can be used to provide similar location data (Gu et al. 2006; Saska et al. 2016; Vásárhelyi et al. 2014; Quintero et al. 2013; Hauert et al. 2011). Although GNSS is widely available, it has relatively low accuracy if compared to MCS and therefore large inter-MAV separation is required to guarantee safe flight (Nägeli et al. 2014). Furthermore, GNSS cannot reliably be used indoors due to signal attenuation (Liu et al. 2007) and can also be subject to multi-path issues in some urban environments or forests (Nguyen et al. 2016).

To increase the versatility of the solution, MAVs should thus use on-board sensors to determine the locations of neighboring MAVs. Often, vision based methods are employed, such as: onboard camera based systems (Nägeli et al. 2014; Iyer et al. 2013; Conroy et al. 2014; Roelofsen et al. 2015), or infrared sensor systems (Kriegleder et al. 2015; Stirling et al. 2012; Roberts et al. 2012). A drawback of these systems is that they have a limited field of view. This issue can be tackled by creating constructs with an array of sensors (Roberts et al. 2012) or by actively tracking neighboring agents (Nägeli et al. 2014) to keep them in the field of view. The first solution introduces a weight penalty, while the second solution severely limits freedom of motion and scalability as a consequence of the need for active tracking of neighbors. Therefore, neither solution is ideal for MAVs. A natively omni-directional sensor would be more advantageous; one such sensor is a wireless radio transceiver.

Guo et al. (2017) recently implemented an ultra wideband (UWB) radio-based system for this. Range measurements are fused with displacement information from each MAV to estimate the relative location between MAVs. However, their method suggests that each MAV must keep track of their own displacement with respect to an initial launching point. If this measurement is obtained through on-board sensors (for example, by integrating velocities) then this measurement can be subject to drift over time.

Alternatively, Coppola et al. (2018) demonstrated a Bluetooth based relative localization method. Rather than using displacement information, the velocities of the MAVs, the orientation, and the height were communicated between each other, and the signal strength was used as a range measurement.

Despite the promising results of range-based solutions, a drawback of the solutions by Coppola et al. (2018) and by Guo et al. (2017) is that the MAVs need knowledge of a common frame orientation. This is established by having each MAV measure their heading with respect to North, which would be typically done with magnetometers. Magnetometers are notoriously susceptible to the local disturbances in the magnetic field. In indoor environments, disturbances upwards of 80\(^\circ \) can occur (Afzal et al. 2010). The difficulty of establishing a reliable direction towards North in an indoor environment is a well known problem. Solutions are found in the form of complementary filters (Roetenberg et al. 2005, 2007; Afzal et al. 2011; Yuan et al. 2015), or the use of redundant magnetic sensors to compensate the local disturbances (Afzal et al. 2010; Li et al. 2006). However, a shared reference frame is not theoretically necessary for the purposes of range-based relative localization (Zhou and Roumeliotis 2008; Martinelli and Siegwart 2005).

Fig. 1
figure 1

Leader–follower flight with 3 Parrot Bebops, equipped with UWB modules. By estimating and communicating their relative range (R) and ego-motion (v), follower 1 (\(f_1\)) and follower 2 (\(f_2\)) are able to localize the leader and able to follow its trajectory with a certain time delay

The main contribution of this paper is an analysis of the consequences of removing the heading dependency in range based relative localization, leading to the development and implementation of a heading-independent relative localization and tracking method that is accurate enough for full on-board indoor leader–follower flight, as shown in Fig. 1. The analysis is provided by a formal observability analysis and by performing limit-case simulations. Differently from the work of Zhou and Roumeliotis (2008) and Martinelli and Siegwart (2005), the analysis also considers the inclusion of acceleration information, since this is commonly known by MAVs from their Inertial Measurement Unit (IMU). Furthermore, our analysis specifically focuses on the implications of removing a heading dependency on the performance of the relative localization filters and on the relative maneuvers that the agents can perform in order to guarantee that the filter remains observable. The observability analysis will show that the task of leader–follower flight is especially difficult with range-based relative localization methods, because it does not allow for the MAVs to fly parallel trajectories. We then use the insights gathered for the development and implementation of a heading-independent leader–follower system that we are able to use on-board of autonomous MAVs operating indoors. The MAVs rely only on on-board sensors, using UWB for both communication and relative ranging.

The structure of the paper is as follows. First, in Sect. 2, we compare the theoretical observability of range based relative localization systems both with and without a reliance on a common heading. The findings from Sect. 2 are verified through simulation in Sect. 3, where we also evaluate the difference in performance that can be expected. We carry this information forward in Sect. 4, where a heading-independent system is implemented on real MAVs, and where we show the results of our leader–follower experiments. The results are further discussed in Sect. 5. Finally, the overall conclusions are drawn in Sect. 6. Future work is discussed in Sect. 7.

2 Observability of the relative localization filter

In this section, an observability analysis is performed that specifically focuses on the practical implications of performing range based relative localization both with and without reliance on a common heading reference. Specifically, we will study the case where one MAV (denoted MAV 1) tracks another MAV (denoted MAV 2). Despite our focus on MAVs in particular, the conclusions that follow hold for any general system that can provide the same sensory information. Furthermore, the results can be extrapolated to more than two MAVs, as will be demonstrated in Sect. 4.

2.1 Preliminaries

We will conduct the analysis by studying the local weak observability of the systems (Hermann and Krener 1977). With an analytical test, briefly introduced in the following, local weak observability can be used to extract whether a specific state can be distinguished from other states in its neighborhood.

Consider a generic non-linear state-space system \(\mathbf {\sum }\):

$$\begin{aligned} \dot{\mathbf {x}}&=\mathbf {f(x,u)} \end{aligned}$$
$$\begin{aligned} \mathbf {y}&= \mathbf {h(x)} \end{aligned}$$

The system \(\sum \) has state vector \(\mathbf {x}=[x_1,x_2,\dots x_n]^\intercal \in \mathbb {R}^n\), an input vector \(\mathbf {u}\in \mathbb {R}^l\), and an output vector \(\mathbf {y}\in \mathbb {R}^m\). The vector function \(\mathbf {f(x,u)}\) contains the definitions for the time derivatives of all the states in \(\mathbf {x}\) and the vector function \(\mathbf {h(x)}\) contains the observation equations for the system. The Lie derivatives of this system are:

$$\begin{aligned} \mathcal {L}_{\mathbf {f}}^0 \mathbf {h}&= \mathbf {h} \end{aligned}$$
$$\begin{aligned} \mathcal {L}_{\mathbf {f}}^1 \mathbf {h}&= \nabla \otimes \mathcal {L}_{\mathbf {f}}^0 \mathbf {h} \cdot \mathbf {f} \end{aligned}$$
$$\begin{aligned} \vdots \nonumber \\ \mathcal {L}_{\mathbf {f}}^i \mathbf {h}&= \nabla \otimes \mathcal {L}_{\mathbf {f}}^{i-1} \mathbf {h}\cdot \mathbf {f} \end{aligned}$$

where \(\otimes \) is the Kronecker product and \(\nabla \) is the differential operator, defined as \(\nabla =[\frac{\partial }{\partial x_1},\frac{\partial }{\partial x_2},\dots ,\frac{\partial }{\partial x_n}]\). Note that, accordingly, \(\nabla \otimes \mathbf {h}\) is equivalent to the Jacobian matrix of \(\mathbf {h}\). Using these definitions, an observability matrix \(\mathcal {O}\) can be constructed, as in Eq. 6.

$$\begin{aligned} \mathcal {O} = \left[ {\begin{array}{*{20}{c}} {\nabla \otimes \mathcal {L}_{\mathbf {f}}^0 \mathbf {h} }\\ {\nabla \otimes \mathcal {L}_{\mathbf {f}}^1 \mathbf {h} }\\ {\vdots }\\ {\nabla \otimes \mathcal {L}_{\mathbf {f}}^i \mathbf {h} } \end{array}} \right] , \quad \ i\in \mathbb {N} \end{aligned}$$

A system is locally weakly observable if the observability matrix is full rank.

2.2 Reference frames

For the analyses that follow, consider the reference frames schematically depicted in Fig. 2. Denoted by \(\mathcal {I}\) is the Earth-fixed North-East-Down (NED) reference frame, which is assumed to be an inertial frame of reference. Denoted by \(\mathcal {H}_i (i=1,2)\) is a body-fixed reference frame belonging to MAV i. Its origin is coincident with MAV i’s centre of gravity, and its location with respect to the \(\mathcal {I}\) frame is represented by the vector \(\mathbf {p_i}\). \(\mathcal {H}_i\) is a horizontal frame of reference, such that the z-axis of the \(\mathcal {H}_i\) frame always remains parallel to that of the \(\mathcal {I}\) frame. The \(\mathcal {H}_i\) frame is rotated with respect to the \(\mathcal {I}\) frame only about the positive z-axis by an angle \(\psi _i\), where \(\psi _i\) is the heading that MAV i has with respect to North, also referred to as its yaw angle. The rate of change of \(\psi _i\) is represented by \(r_i\).

Note that the \(\mathcal {H}_i\) frame is different from a typical body-fixed frame \(\mathcal {B}_i\), which uses the three Euler angles for roll, pitch, and yaw to represent the MAVs physical orientation with respect to the \(\mathcal {I}\) frame. Using \(\mathcal {H}_i\) rather than \(\mathcal {B}_i\) simplifies the kinematic relations without having to impose assumptions on the MAVs flight condition (e.g., being in a near-hover state with small roll and pitch angles).

Fig. 2
figure 2

Reference frames used in this paper. Frame \(\mathcal {I}\) in purple is the earth-fixed North East Down frame (assumed to be inertial). Frames \(\mathcal {H}_1\) (blue) and \(\mathcal {H}_2\) (red) are body fixed reference frames for MAVs 1 and 2, respectively (Color figure online)

2.3 Nonlinear system description

We shall study the case where MAV 1 attempts to estimate the relative position of MAV 2. We use \(\mathbf {p}\) to denote this relative position, such that \(\mathbf {p}=\mathbf {p_2}-\mathbf {p_1}\) (see Fig. 2). Furthermore, let \(\mathbf {v_i}\) and \(\mathbf {a_i}\) be the linear velocities and accelerations of frame \(\mathcal {H}_i\) with respect to frame \(\mathcal {I}\) expressed in frame \(\mathcal {H}_i\), respectively. Finally, let \({\varDelta }\psi \) represent the difference in heading between MAVs 1 and 2, such that \({\varDelta }\psi = \psi _2 - \psi _1\).

Since the horizontal plane of \(\mathcal {H}_i\) matches the horizontal plane of \(\mathcal {I}\), height of the MAVs from the ground can be treated as a decoupled dimension. This does not affect the observability result as long as the MAVs are both capable of measuring and comparing their own height, which is the case. Therefore, for brevity, height will not be included in the following analysis. The vectors for the relative position \(\mathbf {p}\), the velocity \(\mathbf {v_i}\), and the acceleration \(\mathbf {a_i}\) can thus be expanded as 2D vectors: \(\mathbf {p}^\intercal = [p_x,p_y]^\intercal \), \(\mathbf {v_i}=[v_{x,i},v_{y,i}]^\intercal \), \(\mathbf {a_i}=[a_{x,i},a_{y,i}]^\intercal \), \(i=1,2\).

The rate of change of \({\varDelta }\psi \) is \({\varDelta }\dot{\psi } = r_2-r_1\). Note that the value for \(r_i\) is not equal to the yaw rate as would commonly be measured by an on-board rate gyroscope in the body frame \(\mathcal {B}_i\). Instead, \(r_i\) is expressed as:

$$\begin{aligned} r_i = \frac{sin(\phi _i)}{cos(\theta _i)} \tilde{q}_i + \frac{cos(\phi _i)}{cos(\theta _i)} \tilde{r}_i \end{aligned}$$

where \(\tilde{q}_i\) and \(\tilde{r}_i\) represent the true pitch and yaw rate as would be measured by a rate gyroscope, and \(\phi _i\) and \(\theta _i\) are the roll and pitch angles of the MAV. However, for the sake of simplicity, \(r_i\) will be referred to as the MAV’s yaw rate.

Similarly, \(\mathbf {a_i}\), which is the value for the linear acceleration of the \(\mathcal {H}_i\) frame expressed in coordinates of the \(\mathcal {H}_i\) frame, is not equal to what is measured by the on-board accelerometer. Instead, it is equal to:

$$\begin{aligned} \mathbf {a_i} = \left[ {\begin{array}{*{20}{c}} {c(\theta _i)}&{}{s(\phi _i)s(\theta _i)} &{} {c(\phi _i) s(\theta _i)})\\ {0}&{}{c(\phi _i)} &{} {-s(\phi _i)} \end{array}} \right] \mathbf {s_{i}} \end{aligned}$$

where \(\mathbf {s}_{\mathbf{i}}\) is the specific force measured in the body frame \(\mathcal {B}_i\) by the accelerometer of MAV i. Furthermore, \(c(\alpha )\) and \(s(\alpha )\) represent short hand notation for \(cos(\alpha )\) and \(sin(\alpha )\), respectively. The matrix in this equation consists of the first two rows of the rotation matrix from the \(\mathcal {B}_i\) frame to the \(\mathcal {H}_i\) frame.

Following the above, the complete state vector of the system is given by \(\mathbf {x} = [\mathbf {p}^\intercal , {\varDelta }\psi , \mathbf {v_1}^\intercal , \mathbf {v_2}^\intercal ]^\intercal \), and the input vector is \(\mathbf {u}^\intercal = [\mathbf {a_1}^\intercal ,\mathbf {a_2}^\intercal ,r_1,r_2]^\intercal \). The continuous time state differential equations can be written as:

$$\begin{aligned} \dot{\mathbf {x}} = \mathbf {f(x,u)}= \left[ {\begin{array}{*{20}{c}} {-\mathbf {v_1}+\mathbf {R}\mathbf {v_2}-\mathbf {S_1 p}}\\ {r_2-r_1}\\ {\mathbf {a_1-S_1 v_1}}\\ {\mathbf {a_2-S_2 v_2}} \end{array}} \right] \end{aligned}$$

where \(\mathbf {R}\) is the 2D rotation matrix from frame \(\mathcal {H}_2\) to \(\mathcal {H}_1\):

$$\begin{aligned} \mathbf {R} = \mathbf {R}({\varDelta }\psi ) = \left[ {\begin{array}{*{20}{c}} {cos({\varDelta }\psi )}&{}\quad {-sin({\varDelta }\psi )} \\ {sin({\varDelta }\psi )}&{}\quad {cos({\varDelta }\psi )} \end{array}} \right] \end{aligned}$$

The matrices \(\mathbf {S_1}\) and \(\mathbf {S_2}\) are the skew-symmetric matrix equivalent of the cross product, adapted to the 2D case. The matrix \(\mathbf {S_i}\) is equal to:

$$\begin{aligned} \mathbf {S_i} = \mathbf {S_i}(r_i) = \left[ {\begin{array}{*{20}{c}} {0}&{}\quad {-r_i} \\ {r_i}&{}\quad {0} \end{array}} \right] ,\quad i = 1,2 \end{aligned}$$

The variables \(\mathbf {a_i}\) and \(r_i\) are inputs into the system and MAV 1 must thus have knowledge of these values. However, these are typically available from accelerometer and gyroscope data in combination with the appropriate relations given in Eqs. 7 and 8.

Finally, Eq. 9 needs to be complemented with an observation model. The MAVs should be able to measure the relative range between each other, along with their own and the other’s velocities. Then, the analysis that follows aims to study the difference between the following two scenarios: a scenario where the above measurements are the only measurements and a scenario where the MAVs are additionally capable of observing each other’s headings. The situation where the MAVs can observe a heading is referred to as \(\sum _A\) and the situation where a heading is not observed is referred to as \(\sum _B\).

  • \(\sum _{A}\): The scenario where \(\psi _1\) and \(\psi _2\) are observed is equivalent to \({\varDelta }\psi \) (the difference in headings) being observed. Therefore, for \(\sum _A\), the observation model is:

    $$\begin{aligned} \mathbf { y_A} = \mathbf {h_A(x)} = \left[ {\begin{array}{*{20}{c}} {h_{A1}(\mathbf {x})} \\ {h_{A2}(\mathbf {x})} \\ {\mathbf {h_{A3}}(\mathbf {x})}\\ {\mathbf {h_{A4}}(\mathbf {x})} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}}\\ {{\varDelta }\psi }\\ {\mathbf {v_1}}\\ {\mathbf {v_2}} \end{array}} \right] \end{aligned}$$

    Note that the observation equation \(h_{A1}(\mathbf {x})\) is slightly modified with regards to the previously mentioned measurements. Rather than observing the range between the two MAVs (i.e. \(||\mathbf {p}||_2\)), half the squared range is observed (i.e. \(\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}\)). This change makes the observability analysis more convenient without affecting its result. Both \(||\mathbf {p}||_2\) and \(\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}\) contain the same information as far as observability of the system is concerned (Zhou and Roumeliotis 2008).

  • \(\sum _{B}\): In this case, the headings of the MAVs are not measured, and it is thus not possible to observe the difference in heading \({\varDelta }\psi \) directly. For \(\sum _B\), the observation model is:

    $$\begin{aligned} \mathbf {y_B} = \mathbf {h_B(x)} = \left[ {\begin{array}{*{20}{c}} {h_{B1}(\mathbf {x})} \\ {\mathbf {h_{B2}}(\mathbf {x})}\\ {\mathbf {h_{B3}}(\mathbf {x})} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}}\\ {\mathbf {v_1}} \\ {\mathbf {v_2}} \end{array}} \right] \end{aligned}$$

The effect of the difference in the observation equations is studied in the following sections.

2.4 Observability analysis with a common heading reference

For system \(\sum _A\), which uses the observation model from Eq. 12, the first entry in the observability matrix is equal to:

$$\begin{aligned} \nabla \otimes \mathcal {L}_{\mathbf {f}}^0 \mathbf {h}_A = \nabla \otimes \mathbf {h}_A&= \left[ {\begin{array}{*{20}{c}} {\mathbf {p}^\intercal }&{}\quad {0}&{}\quad {\mathbf {0}_{1\times 2}}&{}\quad {\mathbf {0}_{1\times 2}}\\ {\mathbf {0}_{1\times 2}} &{}\quad {1} &{}\quad {\mathbf {0}_{1\times 2}} &{}\quad {\mathbf {0}_{1\times 2}}\\ {\mathbf {0}_{2\times 2}}&{}\quad {\mathbf {0}_{2\times 1}}&{}\quad {\mathbf {I}_{2\times 2}}&{}\quad {\mathbf {0}_{2\times 2}}\\ {\mathbf {0}_{2\times 2}}&{}\quad {\mathbf {0}_{2\times 1}}&{}\quad {\mathbf {0}_{2\times 2}} &{}\quad {\mathbf {I}_{2\times 2}} \end{array}} \right] \nonumber \\&= \left[ {\begin{array}{*{20}{c}} {\mathbf {p}^\intercal }&{}\quad {\mathbf {0}_{1\times 5}}\\ {\mathbf {0}_{5\times 2}}&{}\quad {\mathbf {I}_{5\times 5}} \end{array}} \right] \end{aligned}$$

where \(\mathbf {I}_{n\,\times \,n}\) represents an identity matrix of size \(n~\times ~ n\) and \(\mathbf {0}_{m\,\times \,n}\) represents a null matrix of size \(m~\times ~n\). We can already deduce simplifying information from Eq. 14 that will aid the subsequent analysis. First, note that, for the higher order terms in the observability matrix, the last 5 columns do not contribute to increasing its rank, because these columns are populated with an identity matrix. Furthermore, these higher order terms in the observation matrix (corresponding to the observations of \({\varDelta }\psi \), \(\mathbf {v_1}\), and \(\mathbf {v_2}\)) only have terms in those last 5 columns because none of the higher order Lie derivatives corresponding to those observations depend on the state \(\mathbf {p}\). For this reason, these need not be computed and we can thus omit them for brevity. The remainder of this analysis considers only the terms corresponding to observation \(h_{A1}(\mathbf {x})=\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}\).

The first order Lie derivative corresponding to the observation \(h_{A1}(\mathbf {x})=\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}\) is equal to:

$$\begin{aligned} \mathcal {L}_\mathbf {f}^1 h_{A1} = \mathbf {p}^\intercal (-\mathbf {v_1}+\mathbf {R} \mathbf {v_2}-\mathbf {S_1}\mathbf {p}) \end{aligned}$$

Next, remembering that \(\mathbf {S_1}\) is a skew symmetric matrix, such that \(\mathbf {S_1} + \mathbf {S_1}^\intercal = \mathbf {0}_{2\times 2}\), the following identity is obtained:

$$\begin{aligned} \frac{\partial \mathbf {p}^\intercal \mathbf {S_i} \mathbf {p}}{\partial \mathbf {p}} = \mathbf {p}^\intercal (\mathbf {S_i}+\mathbf {S_i}^\intercal ) = \mathbf {p}^\intercal (\mathbf {0}_{2\times 2}) = \mathbf {0}_{1\times 2} \end{aligned}$$

Using this identity, it can be verified that the second term in the observation matrix corresponding to \(h_{A1}(\mathbf {x})\) is:

$$\begin{aligned} \nabla \otimes \mathcal {L}_\mathbf {f}^1 h_{A1}=\left[ {\begin{array}{*{20}{c}} {-\mathbf {v_1}+\mathbf {R}\mathbf {v_2}} \\ {\mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2}}\\ {-\mathbf {p}}\\ {\mathbf {R}^\intercal \mathbf {p} } \end{array}} \right] ^\intercal \end{aligned}$$

At this point, it would be possible to continue calculating higher order terms for the observability matrix, but in practice this is not necessary. The first term of the observability matrix as shown in Eq. 14 already presents a matrix of rank 6. Since the state is of size 7, this means that only 1 more linearly independent row needs to be added to the observability matrix to provide local weak observability of the system. Furthermore, it is of practical interest to study the scenarios in which the system is locally weakly observable with a minimum amount of Lie derivatives involved in the analysis. This is due to the fact that in practice all signals are noisy, and the derivative of a noisy signal will be even noisier. It will be demonstrated that the terms presented in Eq. 17 are sufficient, under certain conditions, to make the observability matrix full rank.

As mentioned, Eq. 14 already shows that the last five columns of the observability matrix are no longer of interest to increase its rank. Furthermore, only the observation of \(h_{A1}(\mathbf {x})=\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}\) provides non-zero terms in the first two columns of the observability matrix. Therefore, the following matrix can be constructed by collecting the terms of the first two columns in the observation matrix belonging to observation \(h_{A1}(\mathbf {x})\):

$$\begin{aligned} \mathbf {M_A} = \left[ {\begin{array}{*{20}{c}} {\mathbf {p}^\intercal } \\ {-\mathbf {v_1}^\intercal +\mathbf {v_2}^\intercal \mathbf {R}^\intercal } \end{array}} \right] \end{aligned}$$

where the first term is from the zeroth order Lie derivative (see Eq. 14) and the second term from the first order Lie derivative (see Eq. 17). The system is thus observable with a minimum amount of Lie derivatives if the matrix given by Eq. 18 has two linearly independent rows. By the definition of linear independence, this means that the following condition must hold to guarantee local weak observability of the system:

$$\begin{aligned} -\mathbf {v_1} + \mathbf {R}\mathbf {v_2} \ne c \mathbf {p} \end{aligned}$$

where c is an arbitrary constant.

The condition in Eq. 19 essentially tells us that the relative velocity of the two MAVs should not be a multiple of the relative position vector between the two. For more practical insight, we can extract more intuitive conditions that must also be met for Eq. 19 to hold. These conditions are:

$$\begin{aligned} ~\mathbf {p} \ne \mathbf {0}_{2\times 1} \end{aligned}$$
$$\begin{aligned} ~\mathbf {v_1} \ne \mathbf {0}_{2\times 1} \ \mathtt {or} \ \mathbf {v_2} \ne \mathbf {0}_{2\times 1} \end{aligned}$$
$$\begin{aligned} ~\mathbf {v_1} \ne \mathbf {R}\mathbf {v_2} \end{aligned}$$

The first condition tells us that the x and y coordinates of the relative position of MAV 2 with respect to MAV 1 should not be equal to 0. In practice, this would only be possible if the MAVs were separated by height, for otherwise their physical dimension would prevent this condition from occurring. The second condition tells us that one of the two MAVs needs to be moving to render the filter observable, and that the observability is indifferent to which of the MAVs is moving (hence the or operator). The third condition tells us that the MAVs should not be moving in parallel at the same speed, where the rotation matrix \(\mathbf {R}\) transforms \(\mathbf {v_2}\) to the \(\mathcal {H}_1\) frame.

Whilst these three conditions are easier to consider, it should be noted that they form only a subset of the conditions imposed by Eq. 19. For example, the scenario where MAV 2 is stationary, and MAV 1 flies straight towards MAV 2, does not violate any of these three conditions. It does, however, violate Eq. 19. Therefore, the observability of a state and input combination should be checked against the full condition in Eq. 19.

2.5 Observability analysis without a common heading reference

After determining the conditions under which system \(\sum _A\) is locally weakly observable, we compare it to the system where the heading measurements are no longer present. We now consider system \(\sum _B\), whose observation equation (Eq. 13) does not include the state \({\varDelta }\psi \). For this system, the first term in the observability matrix is:

$$\begin{aligned} \nabla \otimes \mathcal {L}_{\mathbf {f}}^0 \mathbf {h}_B = \nabla \otimes \mathbf {h}_B = \left[ {\begin{array}{*{20}{c}} {\mathbf {p}^\intercal }&{}{0}&{}{\mathbf {0}_{1\times 2}}&{}{\mathbf {0}_{1\times 2}} \\ {\mathbf {0}_{2\times 2}}&{}{\mathbf {0}_{2\times 1}}&{}{\mathbf {I}_{2\times 2}}&{}{\mathbf {0}_{2\times 2}}\\ {\mathbf {0}_{2\times 2}}&{}{\mathbf {0}_{2\times 1}}&{}{\mathbf {0}_{2\times 2}}&{}{\mathbf {I}_{2\times 2}} \end{array}} \right] \end{aligned}$$

Equation 23 is very similar to Eq. 14, but with the important difference that the row corresponding to the observation of \({\varDelta }\psi \) is null. Consequently, the matrix is only of rank 5, rather than rank 6. Since the state size is still 7, a minimum of two more independent rows must be added to the observability matrix to make the system locally weakly observable. Once again only the terms corresponding to the observation \(h_{B1}(\mathbf {x})=\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}\) have terms that could increase the rank of the observability matrix. This means that this time a minimum of two more Lie derivatives must be calculated.

It can be verified that the first derivative \(\mathcal {L}_\mathbf {f}^1 h_{B1}\), and thus its state-derivative \(\nabla \mathcal {L}_\mathbf {f}^1 h_{B1}\), are exactly the same as for \(\sum _A\). Therefore, these need not be calculated anymore and are given by Eqs. 15 and 17, respectively. The second order Lie derivative is equal to:

$$\begin{aligned} \mathcal {L}_\mathbf {f}^2 h_{B1}&= (-\mathbf {v_1}^\intercal + \mathbf {v_2}^\intercal \mathbf {R}^\intercal )(-\mathbf {v_1}+\mathbf {R} \mathbf {v_2} - \mathbf {S_1} \mathbf {p}) \nonumber \\&\quad + \mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2}(r_2-r_1) -\mathbf {p}^\intercal (\mathbf {a_1} - \mathbf {S_1}\mathbf {v_1}) \nonumber \\&\quad +\, \mathbf {p}^\intercal \mathbf {R}^\intercal (\mathbf {a_2} - \mathbf {S_2}\mathbf {v_2}) \end{aligned}$$

Some terms in Eq. 24 can be seen to drop out when the equation is expanded. For example, the yaw rate of MAV 1 (\(r_1\)) cancels out completely. Therefore, Eq. 24 reduces to:

$$\begin{aligned} \mathcal {L}_\mathbf {f}^2 h_{B1}&= \mathbf {v_1}^\intercal \mathbf {v_1} + \mathbf {v_2}^\intercal \mathbf {v_2} - 2\mathbf {v_1}^\intercal \mathbf {R} \mathbf {v_2} + \mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2} r_2 \nonumber \\&\quad - \mathbf {p}^\intercal \mathbf {a_1} + \mathbf {p}^\intercal \mathbf {R} \mathbf {a_2} - \mathbf {p}^\intercal \mathbf {R}^\intercal \mathbf {S_2} \mathbf {v_2} \end{aligned}$$

The state derivative of \(\mathcal {L}_\mathbf {f}^2 h_{B1}\) can then be shown to be equal to Eq. 26. Once again, note that some terms drop out (this step has been omitted for brevity).

$$\begin{aligned} \nabla \mathcal {L}_\mathbf {f}^2 h_{B1}=\left[ {\begin{array}{*{20}{c}} {\mathbf {a_1}+\mathbf {R}\mathbf {a_2} } \\ {-2 \mathbf {v_1}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2} + \mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi }\mathbf {a_2}} \\ {2 \mathbf {v_1}- 2 \mathbf {R}\mathbf {v_2} }\\ {-2 \mathbf {R}^\intercal \mathbf {v_1} +2 \mathbf {v_2}} \end{array}} \right] ^\intercal \end{aligned}$$

Just as for \(\sum _A\), a part of the observation matrix can be extracted for analysis. This time, the first three columns in the observation matrix (as opposed to two) are collected for the observation \(h_{B1}(\mathbf {x})=\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}\). Also, this time the terms up to and including the second order Lie derivative are minimally needed to obtain a full rank observability matrix. The following matrix is obtained:

$$\begin{aligned} \mathbf {M_B} = \left[ {\begin{array}{*{20}{c}} {\mathbf {p}^\intercal }&{}{0} \\ {-\mathbf {v_1}^\intercal +\mathbf {v_2}^\intercal \mathbf {R}^\intercal }&{}{\mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi }\mathbf {v_2}}\\ {-\mathbf {a_1}^\intercal +\mathbf {a_2}^\intercal \mathbf {R}^\intercal }&{}{-2 \mathbf {v_1}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2} + \mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {a_2}} \end{array}} \right] \end{aligned}$$

In this case, obtaining the conditions for which this is a full rank matrix is less obvious due to the plethora of terms. Rather than directly demonstrating linear independence of the three rows in Eq. 27, the determinant \(\mathbf {|M_B|}\) may be computed and demonstrated to be non-zero. This is done as follows. Recall that \(\mathbf {p}^\intercal = [p_x,p_y]\). Furthermore, suppose \(-\mathbf {v_1}^\intercal +\mathbf {v_2}^\intercal \mathbf {R}^\intercal = [a,b]\) and \(-\mathbf {a_1}^\intercal +\mathbf {a_2}^\intercal \mathbf {R}^\intercal = [c,d]\). Then, matrix \(\mathbf {M_B}\) can be written as:

$$\begin{aligned} \mathbf {M_B} = \left[ {\begin{array}{*{20}{c}} {p_x} &{} {p_y} &{}{0} \\ {a} &{} {b} &{}{\mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2}}\\ {c} &{} {d} &{}{-2 \mathbf {v_1}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2} + \mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi }\mathbf {a_2}} \end{array}} \right] \end{aligned}$$

The determinant of \(\mathbf {M_B}\) can be computed using a cofactor expansion along the last column of \(\mathbf {M_B}\). This results in:

$$\begin{aligned} |\mathbf {M_B}|&= -\mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2} (d p_x-c p_y) \nonumber \\&\quad + \left( -2 \mathbf {v_1}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2} + \mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {a_2}\right) (b p_x-a p_y) \end{aligned}$$

Now, the following identity can be used:

$$\begin{aligned} b p_x-a p_y = \left[ {\begin{array}{*{20}{c}} {a}&{b} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {-p_y} \\ {p_x} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {a}&{b} \end{array}} \right] \mathbf {A} \left[ {\begin{array}{*{20}{c}} {p_x}\\ {p_y} \end{array}} \right] , \end{aligned}$$

where \(\mathbf {A}=\left[ {\begin{array}{*{20}{c}} {0}&{}{-1}\\ {1}&{}{0} \end{array}} \right] \).

Substituting back the original expressions for [ab], [cd], and \([p_x,p_y]\), the determinant of \(\mathbf {M_B}\) becomes:

$$\begin{aligned} |\mathbf {M_B}|&= -\mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi }\mathbf {v_2} (-\mathbf {a_1}^\intercal +\mathbf {a_2}^\intercal \mathbf {R}^\intercal ) \mathbf {A} \mathbf {p} \nonumber \\&\quad + \left( -2 \mathbf {v_1}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {v_2} + \mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \mathbf {a_2}\right) \nonumber \\&\quad \times (-\mathbf {v_1}^\intercal +\mathbf {v_2}^\intercal \mathbf {R}^\intercal )\mathbf {A} \mathbf {p} \end{aligned}$$

This can be simplified and written as:

$$\begin{aligned} |\mathbf {M_B}|&=\left[ \mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \left( {-\mathbf {a_2} \mathbf {v_1}^\intercal + \mathbf {v_2} \mathbf {a_1}^\intercal } \right) \right. \nonumber \\&\quad + 2 \left. \mathbf {v_1}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \left( {\mathbf {v_2} \mathbf {v_1}^\intercal - \mathbf {v_2} \mathbf {v_2}^\intercal \mathbf {R}^\intercal } \right) \right] \mathbf {A} \mathbf {p} \end{aligned}$$

This system is thus locally weakly observable with a minimum amount of Lie derivatives if \(\mathbf {|M_B|}\) is non-zero. Due to the specific properties of the \(\mathbf {A}\) matrix in this determinant (see Eq. 30), the following equation must hold to render the determinant \(|\mathbf {M_B}|\) non-zero:

$$\begin{aligned}&\mathbf {p}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \left( {-\mathbf {a_2} \mathbf {v_1}^\intercal + \mathbf {v_2} \mathbf {a_1}^\intercal } \right) \nonumber \\&\quad +\, 2 \mathbf {v_1}^\intercal \frac{\partial \mathbf {R}}{\partial {\varDelta }\psi } \left( {\mathbf {v_2} \mathbf {v_1}^\intercal - \mathbf {v_2} \mathbf {v_2}^\intercal \mathbf {R}^\intercal } \right) \ne k \mathbf {p}^\intercal \end{aligned}$$

where k is an arbitrary constant.

Just as for Eq. 19, we can extract a more intuitive subset of conditions for Eq. 33 that also definitely must be met for the system to be observable. These conditions are:

$$\begin{aligned} ~~\mathbf {p} \ne \mathbf {0}_{2\times 1} \end{aligned}$$
$$\begin{aligned}&(\mathbf {v_1} \ne \mathbf {0}_{2\times 1} \ \mathtt {or} \ \mathbf {a_1} \ne \mathbf {0}_{2\times 1}) \ \mathtt {and} \nonumber \\&\displaystyle (\mathbf {v_2} \ne \mathbf {0}_{2\times 1} \ \mathtt {or} \ \mathbf {a_2} \ne \mathbf {0}_{2\times 1}) \end{aligned}$$
$$\begin{aligned} ~~\mathbf {v_1} \ne s\mathbf {R v_2} \ \mathtt {or} \ (\mathbf {a_1} \ne \mathbf {0}_{2\times 1} \ \mathtt {or} \ \mathbf {a_2} \ne \mathbf {0}_{2\times 1}) \end{aligned}$$

where s is an arbitrary constant.

The first condition tells us that the determinant \(|\mathbf {M_B}|\) is zero if the x and y coordinates of the origins of frames \(\mathcal {H}_1\) and \(\mathcal {H}_2\) coincide. This is the same as for \(\sum _A\). The second condition tells us that both MAVs need to be moving. This movement may be either through having a non-zero velocity, or through having a non-zero acceleration (the violation of which is shown in Fig. 3a). The third condition tells us that the MAVs may not move in parallel, as in Fig. 3b, unless at least one of the MAVs is also accelerating at the same time. Note that this time the MAVs are not allowed to move in parallel regardless of whether they are moving at the same speed or not, hence the scalar multiple s.Footnote 1 By comparison, the equivalent condition for \(\sum _A\) only specified that the MAVs may not move in parallel at the same speed.

Fig. 3
figure 3

Representations of four unobservable state and input combinations. The relative position \(\mathbf {p}\), the velocities \(\mathbf {v_i}\), and the accelerations \(\mathbf {a_i}\) of MAVs 1 and 2 are depicted

Fig. 4
figure 4

Color map of observability for different relative positions. The velocities and accelerations of the MAVs are kept as depicted by Fig. 3 and the values for \(\mathbf {p}^\intercal =[p_x,p_y]^\intercal \) are varied over a \(10~\mathrm{m}\) range (Color figure online)

In order to study these intuitive conditions in further detail, we evaluated how the observability of the system is affected once the relative position \(\mathbf {p}\) between the MAVs changes. By varying the \(p_x\) and \(p_y\) values of the vector \(\mathbf {p}\) around the originally set values for \(\mathbf {p}\) (as in Fig. 3), we analyzed the observability of the system for different relative positions, while keeping the velocities and accelerations constant. The measure for observability was obtained by interpreting the meaning of Eq. 33. It essentially tells that the left hand side of the equation should not be parallel to the relative position vector \(\mathbf {p}\). Therefore, a practical measure of observability is how far away the left hand side of Eq. 33 is from being parallel to \(\mathbf {p}\), which can be tested with the cross product. The absolute value of the cross product is then used as a measure of the observability of the system. This paper considers a cross product less than a value of 1 to be unobservable. In theory, only when the cross product is 0 does it actually represent an unobservable condition. However, such a threshold facilitates visibility on the plots and provides insight on what the near-unobservable conditions are and their proportion in relation to the remaining conditions.

For the case of the second (Eq. (35), Fig. 4a) and the third intuitive condition (Eq. (36), Fig. 4c) it can be seen that a varying \(\mathbf {p}\) does not affect the unobservability in the color map. Once an acceleration vector is added to the state of MAV 1 in both cases, specifically \(\mathbf {a}_1~=[0.3~0.3]^\intercal \), the color plots in Fig. 4b, d show that for a set of relative positions, the system does become observable again. However, the chances of the MAVs ending up in an unobservable state are still significant within an operating area of \(100~\mathrm{m^2}\).

The three intuitive conditions we extracted are only a subset of all conditions imposed by Eq. 33. This means that there exist state and input combinations that satisfy the three intuitive conditions, but that do not satisfy Eq. 33. In order to study what the implications of the full unobservability condition in Eq. 33 are, we used the Nelder–Mead simplex method to find other points in the state and input space that violate the full observability condition. Two examples are shown in Fig. 3c, d. These scenarios do not violate any of the intuitive conditions given by Eqs. 3436. The relative position is non-zero, both MAVs have non-zero velocities and accelerations, and the velocity vectors are not parallel. Nevertheless, they violate Eq. 33. Based on this, color maps for the unobservable conditions in Fig. 3c, d are given in Fig. 4e, f, respectively.

Both color maps of Fig. 4e, f clearly show a non-linear relationship between the relative position vector \(\mathbf {p}\) and the observability of the system. Moreover, both maps show a different non-linear relationship. Figure 4e shows more of a hyperbolic relationship, whereas the unobservable region in Fig. 4f looks more elliptical. It can be shown that different conditions show yet other relationships between the observability of the system for different relative positions \(\mathbf {p}\). Moreover, these relationships only show what happens in two dimensions, which are for the two entries in the vector \(\mathbf {p}\). In reality, the observability condition in Eq. 33 presents an 11-dimensional problem. It is therefore still difficult to deduce general rules from these results. What the latter two color maps do have in common is that the unobservable relative positions are in all cases vastly outnumbered by the observable relative positions. This is different than what was observed for situations that would violate any of the more intuitive conditions in Eqs. 35 and 36.

2.6 Comparison of the two systems

Finally, the results from the observability analysis of both systems will be compared. These will show what the practical implications are when switching from a system that relies on a common heading reference to a system that does not.

A primary result of the analysis is that removing the relative heading measurement results in a system that requires at least one extra Lie derivative in the range observation to make the system locally weakly observable. This is an important result, because it tells us that the heading-independent system \(\sum _B\) relies more heavily on the range equation than \(\sum _A\). Without a heading observation, the range measurement serves to estimate a total of three states, as opposed to two in \(\sum _A\). Some of this information is contained in the second derivative of the range observation, and it is a well known fact that the derivative of a noisy signal will be even noisier. In practice, this means that any system that wishes to perform range-based relative localization without a heading dependency needs an accurate and low-noise range measurement.

Another important result is that the criteria posed for \(\sum _B\) specify that both MAVs must be moving. Contrarily, the criteria for \(\sum _A\) specify that only one of the MAVs must be moving. Whilst this result might not be as relevant for MAV teams, as the MAVs will typically be moving anyway, this result can be important for other applications of range-based relative localization. Think, for example, of the case where a single static beacon is used to estimate the position of a flying MAV using only range sensing and communication. The results of our analysis show that \(\sum _B\) is not observable in this case, and thus a common heading reference must be known for such a system to work or, alternatively, the MAV must track the beacon and then communicate its estimate back to the beacon. Note that, in the case where one of the participants is not moving, if we were to continue our analysis of \(\sum _B\) to higher order Lie derivatives then it would still not be possible to make the observability matrix full rank, so that the condition holds generally.

A third difference is found in the condition for parallel movement of the two MAVs. \(\sum _A\) requires that the MAVs should not move in parallel at the same speed, meaning that there should be a non-zero relative velocity between the two MAVs. Instead, \(\sum _B\) requires that the MAVs should not be moving in parallel regardless of speed. Therefore, even if the second MAV were to be moving twice as fast as the first, the filter would not be observable as long as the direction of movement is the same. However, \(\sum _B\) can bypass this condition in some cases if either of the MAVs is also simultaneously accelerating. Similarly, it can be shown that \(\sum _A\) is able to bypass the parallel motion condition with acceleration, although a second order Lie derivative would be necessary in that case.

3 Verification through Simulations

In this section, we further investigate the conclusions drawn from the analytical observability analysis. At first, a kinematic, noise-free study is performed to verify and confirm the differences in the observability conditions for \(\sum _A\) and \(\sum _B\). Afterwards, the influence of noise and disturbances on the filter are studied.

3.1 Filter design

The filter of choice, used throughout the rest of this paper, is an Extended Kalman Filter (EKF), since this type of filter fits intuitively with how the state-space system was described in Sect. 2. The EKF also uses a state differential model and an observation model. The state differential model can thus be kept exactly as the one given earlier in Eq. 9. The observation models for \(\sum _A\) and \(\sum _B\) are also kept almost the same as given in Eqs. 12 and 13, with the only adjustment that now the full range \(||\mathbf {p}||_2\) is observed, rather than half the squared range \(\frac{1}{2}\mathbf {p}^\intercal \mathbf {p}\). Additionally, in line with earlier research on range-based relative localization on real robots (Coppola et al. 2018), we decided to use an EKF on-board of the real-world MAVs because of its low processing and memory requirements.

An EKF has parameters that need to be tuned, namely: the initial state, the system and measurement noise matrices, and the initial state covariance matrix. The initial state is an important setting that will be described where appropriate in the next sections. The matrices are always tuned to correspond to the actual expected values. The measurement noise matrix is tuned based on the expected quality of the measurement variables, and similarly for the system noise matrix. However, since some of the simulations also make use of perfect measurements and since a zero entry in the measurement noise matrix is not possible, the corresponding entries are then given a small value of \(0.1~\mathrm{m}\). We use \(0.1~\mathrm{m}\) based on what is eventually used on the EKF on-board of the real MAVs. By using UWB antennas for range measurements, we can expect standard deviations of 0.1–\(0.3~\mathrm{m}\) around the true value. Our experimental set-up is described in Sect. 4.3.

3.2 Kinematic, noise-free study of unobservable situations

In the first simulated study, the two MAVs that are studied have kinematic trajectories that can be described analytically. The MAVs also have perfect noise-free knowledge of the inputs and measurements. The kinematic and noise-free situation is used to confirm conclusions drawn in the observability analysis performed in Sect. 2.

The two MAVs involved in the EKF are designated MAV 1 and MAV 2. MAV 1 shall be the host of the EKF and shall attempt to track the relative position of MAV 2, a.k.a. the tracked MAV. The latter does not contain an EKF. The following three scenarios are studied:

  1. 1.

    MAV 1 (host) is moving and MAV 2 (tracked) is stationary.

  2. 2.

    MAV 1 (host) is stationary and MAV 2 (tracked) is moving.

  3. 3.

    MAV 1 (host) and MAV 2 (tracked) are both moving in parallel to each other at different speeds.

These scenarios have been chosen because they match the intuitive conditions where \(\sum _A\) is observable, but \(\sum _B\) is not. These are limit cases and therefore provide valuable verification of the analytically found differences between the two systems.

The simulations will show whether these different scenarios have convergent EKFs or not. The focus of this analysis is on the estimation of the relative position \(\mathbf {p}\) and the relative heading \({\varDelta }\psi \). Since the velocities are observed directly, these are observable regardless of the situation, and are thus not shown.

The initial velocities of MAVs 1 and 2 are initialized to their true value, since these are not the variables of interest in this analysis. The initial position and relative heading are initialized with an error, the specifics of which will be given in the respective scenarios. The yaw rates and headings of both MAVs are kept at \(0~\mathrm{rad/s}\) and \(0~\mathrm{rad}\), respectively. The EKF runs at a frequency of \(50~\mathrm{Hz}\).

The error measure throughout this paper is the Mean Absolute Error (MAE). The separate x and y errors in the relative location estimate \(\mathbf {p}\) are combined according to the norm \(||\mathbf {p}||_2\). This choice was made because the separate errors in x and y directions offer little additional insight and are usually identical.

3.2.1 MAV 1 (host) moving, MAV 2 (tracked) stationary

Previous analytical analysis has shown that \(\sum _A\) is locally weakly observable, while \(\sum _B\) is not observable. This result is therefore expected to be reflected in the simulation as well.

Fig. 5
figure 5

\(\sum _A\) EKF convergence for case 1: MAV 1 (host) moving, MAV 2 (tracked) stationary

In the simulation, MAV 1 (the host) is positioned at \(\mathbf {p}_{1,0}^\intercal = [0,0]^\intercal \) and has a constant velocity \(\mathbf {v_1}^\intercal = [1,0]^\intercal \). MAV 2 (the tracked MAV) is positioned at \(\mathbf {p}_{2,0}^\intercal = [1,1]^\intercal \) with no velocity or acceleration. The initial guess of MAV 1 for the relative position and heading of MAV 2 is \([\hat{\mathbf {p}}_0^\intercal ,\hat{{\varDelta }\psi }_0]^\intercal = [0.1,0.1,1]^\intercal \). This means that the initial estimation error in \(p_x\), \(p_y\), and \({\varDelta }\psi \) is thus equal to 0.9, 0.9, and 1, respectively.

As can be seen in Fig. 5, both the relative position \(\mathbf {p}\) error and the relative heading \({\varDelta }\psi \) error quickly converge to 0. Contrarily, the observability analysis of \(\sum _B\) has shown that this scenario is not locally weakly observable, because the second condition is violated, i.e., one of the MAVs is not moving. However, Fig. 6 shows that the \(||\mathbf {p}||_2\) error converges to 0 just as rapidly as for \(\sum _A\). A more thorough inspection shows that the unobservable state of the system is in fact \({\varDelta }\psi \), which is the one that does not converge. This is a favorable result, since the relative position is typically the variable of interest, rather than the difference in heading.

The reason that this occurs lies in the information provided by the first state differential equation. This equation tells us that \(\dot{\mathbf {p}} = -\mathbf {v_1}+\mathbf {R}\mathbf {v_2}-\mathbf {S_1 p}\). The only dependency that this equation has on the relative heading \({\varDelta }\psi \) is in the rotation matrix \(\mathbf {R}\). Therefore, as long as \(\mathbf {v_2}\) is equal to \(\mathbf {0}\), the differential equation for \(\dot{\mathbf {p}}\) has no dependency on the relative heading between the two MAVs. The convergence of \(\mathbf {p}\) therefore remains unaffected. The situation changes when it is \(\mathbf {v_2}\) that is non-zero and \(\mathbf {v_1}\) that is zero. This case will be studied next.

Fig. 6
figure 6

\(\sum _B\) EKF convergence for case 1: MAV 1 (host) moving, MAV 2 (tracked) stationary

3.2.2 MAV 1 (host) stationary, MAV 2 (tracked) moving

For this case, all of the parameters are the same as for case 1, with the only difference being that now \(\mathbf {v_1}=\mathbf {0}\) and \(\mathbf {v_2}^\intercal = [1,0]^\intercal \). The analytical observability analysis has shown that this scenario is locally weakly observable for \(\sum _A\). As expected, it can be seen in Fig. 7 that both the errors for \(\mathbf {p}\) and \({\varDelta }\psi \) converge rapidly to 0. The observability analysis has then shown that \(\sum _B\) is not locally weakly observable in this scenario. Indeed, Fig. 8 shows that both \(||\mathbf {p}||_2\) and \({\varDelta }\psi \) do not converge and that \(||\mathbf {p}||_2\) diverges.

Fig. 7
figure 7

\(\sum _A\) EKF convergence for case 2: MAV 1 (host) stationary, MAV 2 (tracked) moving

This time, because \(\mathbf {v_2}\) is not equal to \(\mathbf {0}\), the state differential equation for the relative position of MAV 2 has a dependency on the relative heading state \({\varDelta }\psi \). Since \({\varDelta }\psi \) does not converge to its true value, and eventually settles at an error of approximately \(1.5~\mathrm{rad}\), there is a large inaccuracy in the state differential equation for \(\dot{\mathbf {p}}\). This consequently results in an ever increasing error in \(\mathbf {p}\), because MAV 1 essentially ‘thinks’ that MAV 2 is flying in a different direction than it really is.

This shows the reason as to why it is generally not possible for a stationary vehicle (or beacon) to be tracking a moving vehicle using range-only measurements and velocity information without a common heading reference. Contrarily, it is possible for a moving vehicle to be tracking a stationary vehicle or beacon’s position. This is entirely caused by the fact that a vehicle will always be ‘aware’, in its own body frame, of the direction it is moving in and hence does not need a convergent estimate of the relative heading with respect to the vehicle it is tracking. However, when the vehicle it is tracking does move, it needs this convergent estimate of the relative heading to know which direction the other is moving in.

Fig. 8
figure 8

\(\sum _B\) EKF convergence for case 2: MAV 1 (host) stationary, MAV 2 (tracked) moving

Fig. 9
figure 9

\(\sum _A\) EKF convergence for case 3: MAV 1 (host) and MAV 2 (tracked) moving in parallel

3.2.3 MAV 1 (host) and MAV 2 (tracked) moving in parallel at different speeds

Finally, the case where both MAVs are moving in parallel, but at different speeds, is studied. Once more, most of the parameters are kept the same as those presented under case 1. This time, the velocity of MAV 2 is set to \(\mathbf {v_2}^\intercal =[1,0]^\intercal \) and the velocity of MAV 1 is set in a parallel direction, but with twice the magnitude (\(\mathbf {v_1}^\intercal =2\mathbf {v_2}^\intercal =[2,0]^\intercal \)).

According to the observability analysis, this is one of the limit cases where \(\sum _A\) is still just observable, but \(\sum _B\) is not. Indeed, Fig. 9 shows convergent behavior for \(\sum _A\), whereas Fig. 10 shows divergence for \(\sum _B\). Note that the filter for \(\sum _B\) has a decreasing error in \({\varDelta }\psi \). However, the convergence of \({\varDelta }\psi \) is very slow. Furthermore, the error for \(\mathbf {p}\) continues to rise indefinitely.

This result concludes the noise-free simulations that compare the performance of the filters for \(\sum _A\) and \(\sum _B\). These simulations verify that the differences between the intuitive unobservable conditions that we derived for the two filters in Sect. 2 also hold true when translated to a simulation environment.

Fig. 10
figure 10

\(\sum _B\) EKF convergence for case 3: MAV 1 (host) and MAV 2 (tracked) moving in parallel

3.3 Kinematic noisy range measurements study of observable situation

Whilst a noise-free study demonstrates the feasibility of the proposed filter and can verify the differences between \(\sum _A\) and \(\sum _B\), it is also important to study the filter’s performance when presented with noisy data. Not only is this more representative of the filter’s performance in practice, but it also can be used to verify one of the main conclusions that were drawn in the observability study, namely that \(\sum _B\) needs information present in the second derivative of the range data to be observable, compared to only a first derivative for \(\sum _A\). It is consequently expected that, with all other parameters fixed, \(\sum _B\) will perform increasingly worse as the range measurement noise increases.

In this study, we steer away from unobservable scenarios. The intent now is to study both filter’s performances for the case where the filters are known to be observable, in order to compare their performance. For this reason, the trajectories of MAV 1 (host) and MAV 2 (tracked) are designed so as to stay clear of the unobservable situations and to excite the filter properly through relative motion. The trajectories that we devised for this study are perfectly circular, and we assume that the MAVs fly at the same height.

The trajectories, depicted in Fig. 11, can be described in polar coordinates \([\rho , \theta ]\). MAV 1 flies a circular motion at an angular velocity \(\dot{\theta _1}=\omega _1\) with radius \(\rho _1\), and MAV 2 flies at angular velocity \(\dot{\theta _2}=\omega _2\) with radius \(\rho _2\). To ensure that both MAVs have sufficient relative motion, one MAV flies clockwise and the other counter clockwise, such that \(\omega _1=-\omega _2\). Moreover, the radius of MAV 2’s trajectory is \(1~\mathrm{m}\) larger than MAV 1’s trajectory, and is offset by \(90^{\circ }\) in angle, such that \(\rho _1=\rho _2-1\) and \(\theta _1=\theta _2+\frac{\pi }{2}\).

Fig. 11
figure 11

Two circular trajectories for MAV 1 and MAV 2

The radius difference in the trajectories ensures that the situation \(\mathbf {p}=\mathbf {0}\) is avoided, and the angle offset ensures that the relative velocities are distributed more or less equally in x and y directions. In these simulations, for simplicity, both MAVs keep a steady heading such that \(\psi _1=\psi _2\) and \(r_1=r_2=0\).Footnote 2 Switching back to Cartesian coordinates, the trajectories can thus be analytically described as follows. MAV 2’s position vector in time is given by:

$$\begin{aligned} \mathbf {p_{2}}(t)=\left[ {\begin{array}{*{20}{c}} {\rho _2 cos(\omega _2 t)} \\ {\rho _2 sin(\omega _2 t)} \end{array}} \right] \end{aligned}$$

MAV 1’s position vector in time can be described by:

$$\begin{aligned} \mathbf {p_{1}}(t)&=\left[ {\begin{array}{*{20}{c}} {(\rho _2-1) cos(-\omega _2 t + \frac{\pi }{2})} \\ {(\rho _2-1) sin(-\omega _2 t+\frac{\pi }{2})} \end{array}} \right] \nonumber \\&= \left[ {\begin{array}{*{20}{c}} {-(\rho _2-1) sin(-\omega _2 t)}\\ {(\rho _2-1) cos(-\omega _2 t)} \end{array}} \right] \end{aligned}$$

The equations for \(\mathbf {v_i}(t)\) and \(\mathbf {a_i}(t)\) can be obtained by taking the time derivatives with respect to \(\mathbf {p_i}(t), \ i=1,2\). Note that this is not true for the general case, since \(\mathcal {H}_i\) is a rotating frame of reference, but in this case it is possible because the MAVs keep a constant heading equal to 0 rad.

By setting \(\rho _2=4~\mathrm{m}\) and \(\omega _2=\frac{2\pi }{20}~\mathrm{rad}\), the trajectory of MAV 2 becomes a circle with a radius of \(4~\mathrm{m}\) that is traversed in \(20~\mathrm{s}\). To comply with the previously defined constraints, \(\rho _1\) and \(\omega _1\) are \(3~\mathrm{m}\) and \(-\frac{2\pi }{20}~\mathrm{rad/s}\), respectively. These values are representative of what a real MAV should easily be capable of and result in relative velocities of about \(1~\mathrm{m/s}\) in x and y directions between the two MAVs.

The study will test the performance of the relative localization filter as seen from the perspective of MAV 1, which is tracking MAV 2. The filter is fed perfect information on all state and input values, except for the measurement of the range \(||\mathbf {p}||_2\) between the two MAVs. The range measurement are artificially distorted with increasingly heavy Gaussian white noise. The measured range fed to the filter is thus \(||\mathbf {p}||_{2,m}= ||\mathbf {p}||_2 + n(\sigma _{R})\), where \(n(\sigma _{R})\) is a Gaussian white noise signal with zero mean and standard deviation \(\sigma _{R}\). The standard deviations that are tested are \(0~\mathrm{m}\) (noise free), \(0.1~\mathrm{m}\), \(0.25~\mathrm{m}\), \(0.5~\mathrm{m}\), \(1~\mathrm{m}\), \(2~\mathrm{m}\), \(4~\mathrm{m}\), and \(8~\mathrm{m}\). In practice, a standard deviation of \(8~\mathrm{m}\) could be considered quite high, but this is intentionally chosen with the intent to observe a significant difference in the error. Since this study keeps all the other measurements and inputs noise free, the noise on the range measurement needs to be higher to get a significant increase in the localization error.

This time the EKF runs at \(20~\mathrm{Hz}\), which is more representative of our real-world set-up, discussed later in Sect. 4. The described flight trajectory is simulated for \(20~\mathrm{s}\) each run, which is thus one complete revolution of the circular trajectory. The EKF is initialized to the true state to exclude the effects of initialization.

For each particular noise standard deviation, both the filter for \(\sum _A\) and for \(\sum _B\) are simulated with 1000 different noise realizations. For each realization the MAE of the estimated \(\mathbf {p}\) with respect to its true value is computed, again by considering the combined error in the estimate of \(||\mathbf {p}||_2\). After 1000 realizations, the Average MAE (AMAE) is computed to extract the average performance for all noise realizations.

The resulting AMAE values for systems \(\sum _A\) and \(\sum _B\) are given in Table 1 and are plotted in Fig. 12. As expected, at very low noise values on the range measurement, both the filters for \(\sum _A\) and \(\sum _B\) have very similar error performance. With no noise on the range measurements, the difference between the two filters is only \(4~\mathrm{mm}\). However, since the filter for \(\sum _B\) is more sensitive to noise on the range measurements, it quickly starts to perform worse than \(\sum _A\) as the noise on the range measurement is increased.

Table 1 Average Mean Absolute Error for \(\sum _A\) and \(\sum _B\) over 1000 runs with different noise standard deviation on the range measurement
Fig. 12
figure 12

AMAE in estimate of \(||\mathbf {p}||_2\) for \(\sum _B\) and \(\sum _A\), the error bars indicate the standard deviation

This result is in line with the analytical results presented in Sect. 2. However, it also raises the question of whether removing the dependency on a common heading reference poses any advantage, since \(\sum _A\) performs consistently better than \(\sum _B\). The reason for this result lies in the fact that the studied scenario uses perfect measurements for all the sensors except for the measured range. As mentioned in the introduction, the heading observation is notoriously troublesome and unreliable, especially in an indoor environment (Afzal et al. 2010). Therefore, it would be valuable to study what would happen to this analysis in the case where the heading estimate is not perfect. This is presented next.

3.4 Kinematic noisy range measurements and heading disturbance study for observable situation

In order to compare the results obtained with an imperfect heading measurement to those obtained in the previous section, the same trajectories are simulated (as in Eqs. 38 and 37 for MAVs 1 and 2, respectively). All the other simulation parameters are also kept the same, with one exception. This time, a disturbance is introduced on the heading measurement. The simulated disturbance is modeled to look similar to how a real local perturbation in the magnetic field would perturb a heading estimate. The actual magnetic perturbation and the corresponding heading error are taken from the work of Afzal et al. (2010), where indoor magnetic perturbations are studied. It was found that the obtained disturbance on the heading estimate looks similar to a Gaussian curve, and in this analysis it is thus modeled as such.

The disturbance on the heading estimate in time d(t) is modeled as:

$$\begin{aligned} d(t) = A_d \cdot e^{-\left( \epsilon (t-t_0)\right) ^2} \end{aligned}$$

Here, the amplitude of the disturbance (in radians) is given by \(A_d\), the parameter \(\epsilon \) controls the width of the Gaussian curve, and \(t_0\) controls the location of the curve in time. For this study, \(\epsilon =1\), resulting in a disturbance lasting approximately \(4~\mathrm{s}\), and \(t_0=5~\mathrm{s}\), such that the disturbance peaks at \(5~\mathrm{s}\) into the flight. How such a disturbance looks is presented in Fig. 13 for an amplitude \(A_d\) of \(1~\mathrm{rad}\).

Fig. 13
figure 13

Disturbance on the relative heading measurement in time, for an amplitude \(A_d\) of \(1~\mathrm{rad}\)

Several amplitudes of the disturbance are tested, namely \(0~\mathrm{rad}\), \(0.25~\mathrm{rad}\), \(0.5~\mathrm{rad}\), \(1~\mathrm{rad}\), and \(1.5~\mathrm{rad}\). The final amplitude of \(1.5~\mathrm{rad}\) results in a maximum heading estimate error of almost \(85^{\circ }\), which is approximately equal to the amplitude of the disturbance shown by Afzal et al. (2010). Note that the disturbance is introduced directly on the measurement of \({\varDelta }\psi \) (the difference in headings between two MAVs). This is the situation that would occur if one of the two MAVs would fly in a locally perturbed area.

Since the parameter of interest is how the filter for \(\sum _B\) compares to the filter for \(\sum _A\), the results are represented as a percentage comparison of the relative localization errors between the two filters. This is visually presented in Fig. 14. In the figure, a positive % means that the filter for \(\sum _B\) performs worse than the filter for \(\sum _A\). At 0%, marked by a dotted line, both filters perform equally well.

The comparison shows that as the applied disturbance amplitude on the heading measurement provided to system \(\sum _A\) is increased, the region for which \(\sum _B\) performs better than \(\sum _A\) expands. In the case of the largest disturbance, with \(A_d\) equal to \(1.5~\mathrm{rad}\), filter \(\sum _B\) even performs better at a range noise \(\sigma _R\) equal to \(8~\mathrm{m}\).

This result reinforces the presumption that it is not always better to include a heading measurement in the filter, provided that the range measurement is of a high enough accuracy. We will use this insight for the real-world implementation. In the experimental set-up in Sect. 4, we will use Ultra Wide Band (UWB) radio modules to obtain range measurements between MAVs. To give an idea of what type of range noise standard deviations can actually be achieved in practice, in the executed experiments with real MAVs, the UWB modules resulted in ranging errors with standard deviations between 0.1 and \(0.3~\mathrm{m}\). If we assume a normally distributed ranging error, based on the results shown in Fig. 14, it is then clear that the heading-independent system \(\sum _B\) would be the preferred choice for all heading disturbance amplitudes (except, trivially, for the situation where there is little to no heading disturbance at all).

Fig. 14
figure 14

Percentage error comparison between \(\sum _B\) and \(\sum _A\) for different disturbance amplitudes \(A_d\). Positive percentage means \(\sum _B\) performs worse than \(\sum _A\)

4 Leader–follower flight experiment

In this section we demonstrate the heading-independent filter in practice, which is used for leader–follower flight in an indoor scenario.

4.1 Leader–follower flight considerations

Before designing an actual control method to accomplish leader–follower flight, let’s first reflect on the previous observability analysis results from Sect. 2 and their implications with respect to leader–follower flight. We know that in order to have an observable, heading-independent, system, the combined motion of the leader and follower has to meet the observability condition presented in Eq. 33. We further know that in order to to meet this condition, the three intuitive conditions presented by Eqs. 34 to 36 certainly have to be met. Let’s first consider these conditions:

  1. 1.

    The first condition (Eq. 34) specifies that the relative position between leader and follower must be non-zero. This condition has little implication to leader–follower flight, other than the fact that the follower must follow the leader at a non-zero horizontal distance, which typically is the objective.

  2. 2.

    The second conditions (Eq. 35) tells us that both MAVs must be moving. As far as leader–follower flight is concerned, this is automatically accomplished as long as the leader is not stationary.

  3. 3.

    The third condition (Eq. 36) is especially impactful for leader–follower flight. It specifies that the MAVs should not be moving in parallel (regardless of speed), unless they are also accelerating. A lot of research on leader–follower flight aims to design control laws that would result in fixed geometrical formations between different agents in the formation. This is typically achieved by specifying desired formation shapes, or desired inter-agent distances for members in the swarm (Turpin et al. 2012; Gu et al. 2006; Chiew et al. 2015; Saska et al. 2014). By the very nature of fixed geometries, that would result in parallel velocity vectors.

The third condition requires a different approach to leader–follower flight. Rather than flying in a fixed formation, it is also possible for the follower to fly a delayed version of the leader’s trajectory. As long as the leader’s trajectory is not a pure straight line for long periods of time, this will result in relative motion between the leader and follower. This is the approach taken in this paper.Footnote 3

This solution should also help to prevent the MAVs from getting stuck in an unobservable situation that is not covered by Eqs. 34 to 36, but that is covered by the full observability condition in Eq. 33. We concluded that for the scenarios that are numerically found to be unobservable according to Eq. 33, changing the relative position \(\mathbf {p}\) only slightly can already result in an observable situation. In the proposed method of having the follower fly a time-delayed version of the leader’s trajectory, the relative position vector \(\mathbf {p}\) will naturally change if the leader’s trajectory is not a straight line.

4.2 Leader–follower formation control design

We want to construct a leader–follower control method that results in the follower flying a delayed version of the leader’s trajectory. As it turns out, this type of control can be directly accomplished with the information provided by the relative localization filter.

Consider the schematic in Fig. 15. It shows two arbitrary trajectories in dotted lines. Just as for the previous section, MAV 1 tracks and follows MAV 2. At the top, in blue, is the trajectory for MAV 1, which is represented by its position vector in time \(\mathbf {p_1}(t)\). On the bottom, in red, is the trajectory for MAV 2, \(\mathbf {p_2}(t)\). Suppose the desire is for the follower (MAV 1) to follow the leader’s trajectory (MAV 2) with a time delay \(\tau \). The control problem for MAV 1 can be expressed as the desire to accomplish \(\mathbf {p_1}(t)=\mathbf {p_2}(t-\tau )\).

Fig. 15
figure 15

Control problem for leader–follower flight. In blue is MAV 1’s trajectory in time \(\mathbf {p_1}(t)\). In red is MAV 2’s trajectory in time \(\mathbf {p_2}(t)\). The desire is for MAV 1 to drive \(\mathbf {e}(t)\) to \(\mathbf {0}\) for \(t\rightarrow \infty \) (Color figure online)

Let \(t_n\) indicate the current time at which a control input must be calculated. At the current time, MAV 1 has a body fixed reference frame \(\mathcal {H}_1(t_n)\), whose origin is \(\mathbf {p_1}(t_n)\). At time \(t_n-\tau \), MAV 1 knows the relative position of the leader in its own body fixed frame \(\mathcal {H}_1(t_n-\tau )\), since this information is provided by the relative localization filter. However, for this control method to work, MAV 1 must have knowledge of where the leader’s old position is at the current time \(t_n\). This value of interest is depicted by the vector \(\mathbf {e}(t_n)\) in Fig. 15; it is the positional error with respect to the desired follower’s position at time \(t_n\).

Let \(\mathbf {R}_{\mathcal {H}_i(t_1) \mathcal {H}_i(t_2)}\) be the rotation matrix from frame \(\mathcal {H}_i\) at time \(t_2\), to frame \(\mathcal {H}_i\) at time \(t_1\), defined as:

$$\begin{aligned} \mathbf {R}_{\mathcal {H}_i(t_1) \mathcal {H}_i(t_2)} = \left[ {\begin{array}{*{20}{c}} {cos({\varDelta }\psi _i |_{t_1}^{t_2})}&{}{-sin({\varDelta }\psi _i |_{t_1}^{t_2})} \\ {sin({\varDelta }\psi _i |_{t_1}^{t_2})}&{}{cos({\varDelta }\psi _i |_{t_1}^{t_2})} \end{array}} \right] , \end{aligned}$$

\({\varDelta }\psi _i |_{t_1}^{t_2}\) is the change in heading angle for MAV i from time \(t_1\) to time \(t_2\), which can be calculated as:

$$\begin{aligned} {\varDelta }\psi _i |_{t_1}^{t_2} = \int \limits _{t_1}^{t_2} {r_i(t)dt} \end{aligned}$$

The current positional error for the follower MAV 1, depicted in Fig. 15, can be defined as:

$$\begin{aligned} \mathbf {e}(t_n) = \mathbf {R}_{\mathcal {H}_1(t_n) \mathcal {H}_1(t_n-\tau )} \left( \mathbf {p}(t_n-\tau )-{\varDelta }\mathbf {p}_{t_n-\tau }^{t_n} \right) \end{aligned}$$

The vector \({\varDelta }\mathbf {p}_{t_n-\tau }^{t_n}\) represents how much the follower has moved from time \(t_n-\tau \) until \(t_n\) as defined in frame \(\mathcal {H}_1(t_n-\tau )\). This vector can be calculated using information available to the follower:

$$\begin{aligned} {\varDelta }\mathbf {p}_{t_n-\tau }^{t_n} = \int \limits _{t_n-\tau }^{t_n} {\mathbf {R}_{\mathcal {H}_1(t_n-\tau ) \mathcal {H}_1(t)}\mathbf {v_1}(t)dt} \end{aligned}$$

Finally, one more piece of information is needed in order to be able to design a control law for the follower MAV, which is the model of the follower MAV and how it responds to control inputs. In this paper, it is assumed that the MAV already has stable inner loop control running on board, such that it directly can take velocity commands. It is further assumed that with the inner loops in place, the MAV responds like a very simple first order delay filter to velocity commands, such that the differential equation for the velocity becomes:

$$\begin{aligned} \dot{\mathbf {v_1}} = \varvec{\tau }^{-1}(\mathbf {v_{1c}}-\mathbf {v_1}) \end{aligned}$$

where \(\varvec{\tau }^{-1}\) is a diagonal matrix. The values along the diagonal of \(\varvec{\tau }^{-1}\) are the inverse of the time constants that characterize the delay of the system with respect to a control input \(\mathbf {v_{1c}}\). This is only an approximation of how the actual MAV behaves, but it will be shown to be sufficient to accomplish the desired behavior.

With all this information in place, a control law can be designed. The control law is designed using Nonlinear Dynamic Inversion (NDI) principles. In order to use NDI, a state space model is required for the situation at hand. A very similar state space model to the one used for the relative localization filter can be used. Define the state vector as:

$$\begin{aligned} \bar{\mathbf {x}} = \left[ \mathbf {e}^\intercal , {\varDelta }\bar{\psi }, \mathbf {v_1}^\intercal , \bar{\mathbf {v_2}}^\intercal \right] ^\intercal \end{aligned}$$

The state vector is similar to the one defined before for the relative localization filter, with a few small changes. First of all, \(\mathbf {e}=\mathbf {e}(t)\) represents the current positional error for the follower MAV 1 with respect to the leader’s old position. Secondly, \({\varDelta }\bar{\psi }\) and \(\bar{\mathbf {v_2}}^\intercal \) represent again the difference in heading between two MAVs and the velocity of MAV 2, except now \({\varDelta }\bar{\psi }\) is the difference in heading between frame \(\mathcal {H}_1(t)\) and \(\mathcal {H}_2(t-\tau )\), and \(\bar{\mathbf {v_2}}^\intercal \) is the delayed leader’s velocity at time \(t-\tau \), such that \(\bar{\mathbf {v_2}}^\intercal = \mathbf {v_2}(t-\tau )\).

Similarly, define a new input vector as:

$$\begin{aligned} \bar{\mathbf {u}} = \left[ \mathbf {v_{1c}}^\intercal , \bar{\mathbf {a_2}}^\intercal , r_1, \bar{r_2} \right] ^\intercal \end{aligned}$$

where \(\mathbf {v_{1c}}\) is the actual control input fed to MAV 1, and \(\bar{\mathbf {a_2}}\) and \(\bar{r_2}\) represent the same values as \(\mathbf {a_2}\) and \(r_2\), except that they are delayed versions. Therefore \(\bar{\mathbf {a_2}}=\mathbf {a_2}(t-\tau )\) and \(\bar{r_2} = r_2(t-\tau )\).

Finally, a new set of state differential equations can be defined as:

$$\begin{aligned} \dot{\bar{\mathbf {x}}} = \bar{\mathbf {f}}(\bar{\mathbf {x}},\bar{\mathbf {u}})= \left[ {\begin{array}{*{20}{c}} {-\mathbf {v_1}+\bar{\mathbf {R}}\bar{\mathbf {v_2}}-\mathbf {S_1 e}} \\ {\bar{r_2}-r_1}\\ {\varvec{\tau }^{-1}(\mathbf {v_{1c}}-\mathbf {v_1})}\\ \bar{\mathbf {a_2}}-\bar{\mathbf {S_2}} \bar{\mathbf {v_2}} \end{array}} \right] \end{aligned}$$

where \(\bar{\mathbf {R}}=\mathbf {R}({\varDelta }\bar{\psi })\) and \(\bar{\mathbf {S_2}}=\mathbf {S_2}(\bar{r_2})\).

The state that we wish to control is the current positional error that MAV 1 has with respect to the delayed leader’s position, so the state \(\mathbf {e}\). This state can be represented as:

$$\begin{aligned} \mathbf {e} = \mathbf {H} \bar{\mathbf {x}} \end{aligned}$$

With \(\mathbf {H}\) given by:

$$\begin{aligned} \mathbf {H} = \left[ {\begin{array}{*{20}{c}} {\mathbf {I}_{2\times 2} }&{\mathbf {0}_{2\times 5}} \end{array}} \right] \end{aligned}$$

The derivative of the control variable with respect to time is equal to:

$$\begin{aligned} \dot{\mathbf {e}} = \mathcal {L}_{\bar{\mathbf {f}}}^1 \mathbf {e} = \mathbf {H} {\bar{\mathbf {f}}} = -\mathbf {v_1}+\bar{\mathbf {R}}\bar{\mathbf {v_2}}-\mathbf {S_1 e} \end{aligned}$$

The second derivative of the control variable:

$$\begin{aligned} \ddot{\mathbf {e}}&= \mathcal {L}_{\bar{\mathbf {f}}}^2 \mathbf {e} = (\nabla \otimes \dot{\mathbf {e}}) \cdot \bar{\mathbf {f}} \nonumber \\&= \left[ {\begin{array}{*{20}{c}} {-\mathbf {S_1}}&{\frac{\partial \bar{\mathbf {R}}}{\partial {\varDelta }\bar{\psi }}\bar{\mathbf {v_2}}}&{-\mathbf {I}_{2\times 2}}&{\bar{\mathbf {R}}} \end{array}} \right] \cdot \bar{\mathbf {f}} \nonumber \\&= -\mathbf {S_1} \left( -\mathbf {v_1}+\bar{\mathbf {R}}\bar{\mathbf {v_2}}-\mathbf {S_1 e} \right) + \frac{\partial \bar{\mathbf {R}}}{\partial {\varDelta }\bar{\psi }}\bar{\mathbf {v_2}} \left( \bar{r_2}-r_1 \right) \nonumber \\&\quad -\mathbf {I}_{2\times 2} \left( \varvec{\tau }^{-1}(\mathbf {v_{1c}}-\mathbf {v_1}) \right) + \bar{\mathbf {R}} \left( \bar{\mathbf {a_2}}-\bar{\mathbf {S_2}} \bar{\mathbf {v_2}} \right) \nonumber \\&= \mathbf {D}\mathbf {v_{1c}} + \mathbf {b(x,u)} \end{aligned}$$

With \(\mathbf {D}\) equal to:

$$\begin{aligned} \mathbf {D} = -\mathbf {I}_{2\times 2} \varvec{\tau }^{-1} \end{aligned}$$

and \(\mathbf {b(x,u)}\) equal to:

$$\begin{aligned} \mathbf {b(x,u)}= & {} -\mathbf {S_1} \left( -\mathbf {v_1}+\bar{\mathbf {R}}\bar{\mathbf {v_2}}-\mathbf {S_1 p} \right) + \frac{\partial \bar{\mathbf {R}}}{\partial {\varDelta }\psi }\bar{\mathbf {v_2}} \left( \bar{r_2}-r_1 \right) \nonumber \\&+\,\mathbf {I}_{2\times 2} \varvec{\tau }^{-1}\mathbf {v_1} + \bar{\mathbf {R}} \left( \bar{\mathbf {a_2}-S_2 v_2} \right) \end{aligned}$$

This can further be reduced to:

$$\begin{aligned} \mathbf {b(x,u)}= & {} -\mathbf {S_1} \left( -\mathbf {v_1}+\bar{\mathbf {R}}\bar{\mathbf {v_2}}-\mathbf {S_1 p} \right) \nonumber \\&- \frac{\partial \bar{\mathbf {R}}}{\partial {\varDelta }\psi }\bar{\mathbf {v_2}} r_1 +\mathbf {I}_{2\times 2} \varvec{\tau }^{-1}\mathbf {v_1} + \bar{\mathbf {R}} \bar{\mathbf {a_2}} \end{aligned}$$

At this point the following control law can be chosen:

$$\begin{aligned} \mathbf {v_{1c}} = \mathbf {D}^{-1} (\mathbf {i} - \mathbf {b(x,u)}) \end{aligned}$$

with \(\mathbf {i}\) now a virtual control input.

This control law results in a fully linearized differential equation for the positional error of the follower, since substitution of the control law from Eq. 55 in Eq. 51 results in the following differential equation:

$$\begin{aligned} \ddot{\mathbf {e}} = \mathbf {i} \end{aligned}$$

Which can be shown to be exponentially stable if the following virtual control is implemented:

$$\begin{aligned} \mathbf {i}&= -K_p \mathbf {e} - K_d \dot{\mathbf {e}} \end{aligned}$$
$$\begin{aligned}&K_p, K_d > 0 \end{aligned}$$

4.3 Experimental set-up

One of the main findings in the observability study and the simulation results is that the localization error scales more steeply with range noise for system \(\sum _B\) than for \(\sum _A\). It is therefore important to use sensors that can provide accurate relative ranging measurements.

In this work, we chose to use Ultra Wide Band (UWB) based radio transceivers. UWB has recently gained attention within the domain of ranging. UWB signals are characterized by their fine temporal and spatial resolution (Correal et al. 2003), which leads UWB based systems to be able to, for example, resolve multipath effects more easily (Win and Scholtz 1998). Ultimately, this leads to an accurate ranging performance, which is important if using the heading independent filter. Another advantage of UWB is its relative robustness to interference from other radio technologies due to the fact that it operates on an (ultra) wide range of frequencies (Liu et al. 2007; Foerster et al. 2001; Molisch et al. 2006).

The UWB ranging hardware used in the experiments is the ScenSor DWM1000 module sold by Decawave.Footnote 4 The ranging algorithm that is employed is a particular implementation of the Two-Way Ranging (TWR) method (Neirynck et al. 2016). In order to fuse ranging data with velocity, acceleration, height, and yaw rate data in the localization filter, these variables are also communicated between MAVs by using the UWB devices. The same UWB messages used in the TWR protocol are also used to communicate these variables.

The UWB module transceiver has been installed on the Parrot Bebop 2 platform.Footnote 5 The Bebop 2 runs custom autopilot software designed using the open-source autopilot framework Paparazzi UAV.Footnote 6 Paparazzi UAV provides the stable inner loop control loops for the Bebop 2 using Incremental NDI (INDI, Smeur et al. (2015)). This allows us to control the outer loop by giving the computed velocity commands to the INDI inner loops.

Velocity and height measurements are also necessary for the relative localization filter. In the initial experiments, they are provided by an overhead motion capture system (MCS) by OptiTrack.Footnote 7 In a second iteration of the experiment, they are fully provided by on-board sensors. The velocity data is obtained from the MAVs’ on-board bottom-facing camera using Lucas–Kanade optical flow. Height is measured using an on-board ultrasonic sensor that the Bebop 2 is equipped with by default. At all times, the acceleration and yaw rate measurements are obtained from the MAVs’ on-board accelerometers and gyroscope, respectively. The experiments are first conducted with two MAVs (one leader and one follower), detailed in Sect. 4.4, and then performed again with three MAVs (one leader and two followers), detailed in Sect. 4.5.

4.4 Leader–follower flight with one follower

The experiment with one follower MAV consists of one Bebop 2 following another Bebop 2 using the control law presented in Sect. 4.2. At first, right after take off, the MAVs fly concentric circles just like the ones shown in Fig. 11. This procedure is there to make sure that the EKF running on-board the MAVs has time to converge to the correct result, such that by the time the follower MAV is instructed to start following the leader, the follower has a correct estimate of the relative location of the leader.

When leader–follower flight is engaged, the trajectory of the leader has been designed to sufficiently excite the the relative localization filter during the leader–follower flight and to decrease the likelihood of being stuck in unobservable states. This has been done by introducing frequent turns in the trajectory to have changing relative velocities and accelerations. The follower is instructed to follow the leader’s trajectory with a time delay of \(\tau =5~\mathrm{s}\).

It is important to note that, for safety reasons, the norm of the follower’s commanded velocity \(||\mathbf {v_{1c}}||_2\) during both experiments is saturated at \(1.5~\mathrm{m/s}\). The measure is taken because the MAVs were flying in a relatively small confined area (\(10~\mathrm{m}\) by \(10~\mathrm{m}\)). This change does however have consequences for the performance of the follower’s tracking, which is discussed further in the next sections.

4.4.1 Leader–follower flight with velocity and height information from a MCS

First, the case where velocity and height information is provided by the MCS is studied. In Fig. 16, the trajectory flown by the follower is compared to the trajectory of the leader. The x and y coordinates are compared separately for part of the flight in Fig. 17a, b. In Fig. 18, a time composition of overhead camera images is given for \(5~\mathrm{s}\) of flight as an illustration. The follower’s position is shown at seven time instances during these \(5~\mathrm{s}\), and is compared to the leader’s trajectory.

Fig. 16
figure 16

The trajectories of leader and follower during experiment with MCS height and velocity

Fig. 17
figure 17

The trajectory of the follower in the a x- and b y-coordinate, compared to the delayed trajectory of the leader for the experiment with MCS height and velocity

Fig. 18
figure 18

Time composition of overhead camera images of leader and follower MAV in time, for the experiment with MCS height and velocity. Indicated in red and marked by \(\mathbf {p_2}(t)\), is part of the leader’s trajectory. The leader’s final position is indicated by \(\mathbf {p_2}(t=0)\). Seven points in time of the follower’s trajectory are indicated in the image. According to the control objective, \(\mathbf {p_1}(t=5)\) should equal \(\mathbf {p_2}(t=0)\) (Color figure online)

A total of \(200~\mathrm{s}\) of leader–follower flight were logged and will be analyzed here. During this time, several laps of the designed trajectory were executed. The trajectories in Figs. 16,  17 and 18 indeed show that the follower is successfully tracking a delayed version of the leader’s trajectory. The actual error distribution for the norm of the relative location estimate \(||\mathbf {p}||_2\) is shown in Fig. 19. The errors have a mean value of \(18.4~\mathrm{cm}\) and a maximum value of \(77.5~\mathrm{cm}\), at maximum inter-MAV distances up to \(5~\mathrm{m}\).

Fig. 19
figure 19

Histogram of the localization error for the follower during experiment with MCS height and velocity

Since, in this experiment, the velocity and height measurements were provided with high accuracy by the MCS, one would expect the primary source for the localization error to be the ranging error from the UWB modules. However, inspection of the ranging error actually shows a pretty favorable error distribution. A histogram of the ranging error throughout the flight is given in Fig. 20. The mean of the ranging error is close to zero (about \(-\,6.4~\mathrm{cm}\)) and the errors are well distributed around this mean. This is therefore not the main cause of the occasionally higher relative localization errors.

Fig. 20
figure 20

Histogram of the ranging error during experiment with MCS height and velocity

The most clearly identifiable cause for the relative localization error is the occasional dropping of frames by the UWB modules. The update rate of the relative localization filter is equivalent to the UWB messaging rate, because the filter is updated every time that the UWB modules produce a new ranging result (using a callback function). For two UWB modules, this corresponds to an update rate of about \(25~\mathrm{Hz}\), corresponding to a time step of approximately \(40~\mathrm{ms}\). However, the modules occasionally drop frames, causing the time step to spike up. Over the flight, 2% of all messages were received following an interval of more than \(40~\mathrm{ms}\), and 1% of all messages were received following an interval of \(200~\mathrm{ms}\). In one instance, the interval reached \(470~\mathrm{ms}\), an order of magnitude larger than the average. It is not hard to imagine the unfavorable effect that such events can have for the relative localization estimate. It is therefore not coincidental that the largest localization error recorded during the flight also corresponds to one of those times where the UWB modules dropped frames, causing the update rate of the relative localization filter to also drop.

We now turn our attention to the tracking error of the follower MAV. The tracking error distribution \(||\mathbf {e}||_2\) is given in Fig. 21. The mean of the distribution is equal to \(46.1~\mathrm{cm}\) and the maximum error is \(1.32~\mathrm{m}\). Of course, part of this error is caused by a relative localization error from the follower’s perspective, which will inevitably affect the tracking performance. However, since the relative localization error is considerably lower than the tracking error, there must be more sources to the error.

Fig. 21
figure 21

Histogram of the tracking error \(||\mathbf {e}||_2\) for the follower during experiment with MCS height and velocity

One source of error is the fact that the follower’s response to a velocity command \(\mathbf {v_{1c}}\) is modeled as a first order delay. In reality, the MAV has some overshoot with respect to commands, which is not modeled by this first order delay. This model mismatch by itself might not be that harmful to the performance, since the control law would respond with more aggressive velocity commands as a reaction to the MAV not behaving as modeled. However, the control law’s freedom is severely restricted by the command saturation at \(1.5~\mathrm{m/s}\), which means that the follower cannot move as fast as the command law demands. This argument is further supported by a qualitative analysis of the follower’s trajectory with respect to the leader’s trajectory in Fig. 16. The trajectory of the follower often seems to take ‘shortcuts’ with respect to the leader’s trajectory. This falls in line with the expected behavior due to the command saturation. The control law is designed not only to track the trajectory of the leader in space, but also in time. As the follower starts lagging behind the leader more than the desired \(\tau =5~\mathrm{s}\), the follower starts to take shortcuts in the trajectory to catch up with the leader. This error would be less prevalent if the command saturation were increased.

4.4.2 Leader–follower flight with only on-board measurements

We now demonstrate the workings of the proposed methods in this paper when only on-board sensing is used. In this set-up, the follower MAV does not use any MCS information. Instead, the velocity information comes from Lucas–Kanade optical flow measurements while the height is derived from the on-board ultrasonic sensor. Similarly, the leader MAV directly communicates optical flow velocities and ultrasonic height measurements (along with accelerations and yaw rate from the IMU) to the follower MAV for use in the relative localization filter. The MCS is only used to log ground truth data and for the leader to safely fly its trajectory. No MCS data is used by the follower at all. Again, \(200~\mathrm{s}\) of leader–follower flight with full on-board sensing took place successfully and will be analyzed here.

The trajectory of the follower with respect to the delayed leader’s trajectory is compared in Figs. 22 and 23. Furthermore, another time composition for \(5~\mathrm{s}\) of flight where the follower is tracking the leader is given in Fig. 24.

The main qualitative difference with respect to the situation where the MCS was still used for velocity and height information is the fact that the follower’s trajectory appears less smooth. Otherwise, the performance seems qualitatively similar. The follower still appears to take ‘shortcuts’ with respect to the leader’s trajectory, although the increased disorder in the follower’s trajectory makes this less apparent.

Fig. 22
figure 22

Trajectory of leader and follower during experiment with only on-board sensing and processing

Fig. 23
figure 23

The trajectory of the follower in the a x- and b y-coordinate, compared to the delayed trajectory of the leader for the experiment with only on-board sensing

Fig. 24
figure 24

Time composition of overhead camera images of leader and follower MAV in time, for the experiment with only on-board sensing. Indicated in orange and marked by \(\mathbf {p_2}(t)\), is part of the leader’s trajectory. The leader’s final position is indicated by \(\mathbf {p_2}(t=0)\). Six points in time of the follower’s trajectory are indicated in the image. According to the control objective, \(\mathbf {p_1}(t=5)\) should equal \(\mathbf {p_2}(t=0)\) (Color figure online)

The tracking error distribution for the on-board sensing case is given in Fig. 25. The mean tracking error is \(50.8~\mathrm{cm}\) and the maximum error is \(1.47~\mathrm{m}\). The relative localization error is given in Fig. 26. Here, the mean error is \(22.6~\mathrm{cm}\) and the maximum error is \(75.8~\mathrm{cm}\), at maximum MAV distances up to \(5.2~\mathrm{m}\).

The performance when using only on-board sensing is very similar to when using the MCS for height and velocity data. This can be mainly attributed to the fact that the measurements that have been replaced (the height and velocity of both MAVs) are actually also accurately measured on-board.

The primary reason as to why the trajectory of the follower with on-board sensors still seems slightly more disordered is the fact that the follower has difficulty to accurately control its altitude when using only on-board sensing. The follower now purely relies on height measurements from its ultrasonic sensor. The update rate of this sensor is low, and in between measurements the follower uses (noisy) accelerometer data to update its height. This sometimes causes the follower to believe its altitude is different than it really is, causing it to rapidly ascend or descend. This takes up thrust, restricting the follower’s ability to maneuverer accurately in the horizontal plane due to thrust saturation.

Fig. 25
figure 25

Histogram of the tracking error \(||\mathbf {e}||_2\) for the follower during experiment with only on-board sensing and processing

Fig. 26
figure 26

Histogram of the localization error for the follower during experiment with only on-board sensing and processing

4.5 Leader–follower flight with two followers

To demonstrate that the methods in this paper can also scale to more than one follower, the leader–follower flight is also performed with two follower MAVs instead of one. This is done both with MCS height and velocity data and with only on-board sensing.

For this purpose, The UWB messaging protocol is adapted to allow every MAV to perform ranging with every other MAV. The MAVs also communicate a unique (pre-assigned) identification number within the UWB messages. The followers can use this identification number to determine which messages originate from the leader so that they individually keep track of the leader as before. In the implemented communication protocol, the UWB modules on the MAVs take turns to message their data to the others. This causes a drop in communication rate every time that a new UWB module is introduced. For this reason, the UWB range update rate reduced from about \(25~\mathrm{Hz}\) with 2 MAVs, to about \(16~\mathrm{Hz}\) with 3 MAVs (i.e., with an interval of \(62.5~\mathrm{ms}\)). Additionally, the introduction of the additional module was accompanied with an increase in the communication outages. 5.8% of all messages recorded during our flight were received with an interval longer than the nominal one of \(62.5~\mathrm{ms}\). 2.9% of all messages were received with an interval of \(100~\mathrm{ms}\) or more. The messaging intervals for the flight with two MAVs and three MAVs are compared in Fig. 27.

Fig. 27
figure 27

Messaging rate over flights with 2 and 3 MAVs

This time, due to the lack of space available, there is no initialization flight procedure to give the EKFs of the followers time to converge. Instead, the MAVs are placed in starting positions and orientations that roughly match with what EKFs on-board the MAVs are initialized to. Although this placement is done purely by eye, it is proven to be sufficient to safely start the leader–follower flight.

The leader flies the same trajectory as before. The first follower follows this trajectory with a \(\tau =4~\mathrm{s}\) delay, and the second follower follows it with an \(\tau =8~\mathrm{s}\) delay. Once again, \(200~\mathrm{s}\) of successful flight data is logged and analyzed.

An overhead camera image for the flight with MCS height and velocity data is presented in Fig. 28, giving an idea of how the experiment looked like.Footnote 8 The trajectories for this flight are displayed in Fig. 29 for the leader and two followers. For the flights with only on-board information, the trajectories are shown in Fig. 30.

As for the case with just one follower, we see that the followers tend to take shortcuts with respect to the leader’s trajectory. Furthermore, the flights using only on-board information are less smooth than those with MCS height and velocity information. For the flight with MCS data, follower 1 has a MAE for the relative localization error of only \(15.8~\mathrm{cm}\). By comparison, follower 2 has a MAE of \(43.9~\mathrm{cm}\). Furthermore, followers 1 and 2 have MAE for the tracking of \(42.9~\mathrm{cm}\) and \(70.3~\mathrm{cm}\), respectively. The flight with only on-board sensing resulted in a relative localization MAE of \(51.8~\mathrm{cm}\) and \(53.6~\mathrm{cm}\). The tracking MAE this time was \(58.6~\mathrm{cm}\) and \(98.4~\mathrm{cm}\).

Fig. 28
figure 28

Overhead camera image of leader and two followers using MCS height and velocity. In orange is the leader’s trajectory marked at \(0.5~\mathrm{s}\) intervals (Color figure online)

Fig. 29
figure 29

Trajectory of leader and two followers using MCS height and velocity

Fig. 30
figure 30

Trajectory of leader and two followers using only on-board information

4.6 Comparison of flights

In this section we present the relative localization and tracking MAE of the various flights that were executed. We also discuss in more detail the most noteworthy differences between experiments.

Table 2 Comparison of mean localization (loc.) errors and mean tracking (track.) errors for all performed experimental flights, both for MCS and fully on-board (on-b.) flights

All the errors are presented in Table 2. The first noteworthy observation is the fact that, for the experiment with two followers, the tracking performance of the second follower is worse than for the first follower in both the MCS and fully on-board case. This is a byproduct of the fact that the proposed leader–follower control method inherently relies on integration of velocity information in time. As the delay with which the follower must follow the leader increases, so does the period of time over which the follower must integrate its velocity. This is subject to drift, which shows in the tracking performance. This effect is more noticeable in the fully on-board case, since the velocity estimates from optical flow methods are less accurate than the ones computed by the MCS.

Another result is that the localization error for follower 2 in the MCS case is higher than for the first follower. This can be explained, in part, by the fact that follower 2 has a larger mean range with respect to the leader than follower 1 does (\(4.2~\mathrm{m}\) compared to \(2.9~\mathrm{m}\)). To inspect this deeper, we looked at the logged range between the MAVs. It was found that follower 2 had substantially larger ranging errors with the leader than follower 1. This can be appreciated in Fig. 31, where the ranging error distributions are compared. In both cases, the mean is close to zero, yet the distribution for follower 2 is significantly wider. An investigation of the flight logs revealed that this is most likely associated with a combination of antenna orientation and relative flight trajectory. If the error is analyzed, it can be seen that it is subject to periodic peaks which match the period of the relative bearing between the drones. This can be seen in Fig. 32. However, this effect does not appear to be purely caused by relative bearing, but rather a combination of relative bearing and relative location of the drones, relating to how the antennas were mounted on the drones, whereby the presence of the drone itself likely compromised the signal. This also explains why the correlation is most clear during the first \(100~\mathrm{s}\) of flight and the last 50 s, but it is less clear between 90 and \(150~\mathrm{s}\) of the log. If the trajectory is analyzed, between 100 and \(150~\mathrm{s}\) is when the follower 2 trajectory started varying slightly (follower 2 began to take ‘shortcuts’) bringing the drones at different relative locations to eachother. Such correlations should be investigated further in future work over a variety of flights with different platforms and scenarios.

Fig. 31
figure 31

Comparison between ranging error distributions for follower 1 and 2 for the flight with MCS height and velocity data

Fig. 32
figure 32

Range error and relative bearing between leader and follower 2 during flight

A final result that stands out is that both followers 1 and 2 have substantially higher localization errors in the on-board case than was found for the on-board experiment with a single follower. This result appears to be due to a combination of factors. The increased communication traffic caused a decrease in the filter update rate and also resulted in an increase in ranging frames dropped. Follower 2, as mentioned above, showed a worse ranging performance than follower 1. Follower 1, in turn, had slightly less accurate optical flow velocity estimates than were obtained with the single follower flight (\(21~\mathrm{cm/s}\) MAE compared to \(15~\mathrm{cm/s}\) before) and also slightly higher ranging errors than for the single follower flight (\(15~\mathrm{cm}\) MAE compared to \(8~\mathrm{cm}\) before). All factors combined, both followers suffered a comparable degradation in localization performance.

5 Discussion

In this section we revisit the observability analysis from Sect. 2 with the obtained experimental data. We also present some remarks on the scalability of this methodology to larger groups of MAVs.

5.1 Remarks on observability

Section 2.5 showed that for a specific set of velocities, accelerations and relative positions for both MAVs, the system will become unobservable. To directly integrate the full observability condition in the design of a leader–follower system is difficult due to its high dimensionality. By having followers fly a delayed version of the leader’s trajectory, it is possible to naturally vary the relative positions between leader and follower, as long as the leader’s velocity changes in time. Given the sparsity of unobservable relative positions, we therefore postulated that this control behavior would be sufficient to limit unobservable situations. Furthermore, even if an unobservable situation were to occur, this would only be for a short period of time, as the relative position continuously changes and the system automatically transitions back to being observable.

Having performed the experiments and collected all the ground truth data, it is now possible to test whether this assumption is valid. All the parameters needed to evaluate Eq. 33 have been logged during the experiments and can be inserted into Eq. 33 to check the observability of the relative localization filter in time. In line with our previous analysis, the measure of observability of the system is represented by the cross product between the left hand side of Eq. 33 and the relative position vector \(\mathbf {p}\). Once more, we shall take a threshold of 1. Although theoretically only a value of 0 would indicate an unobservable system, the higher threshold is chosen to account for noise in the data.

With the chosen threshold, the unobservable data points for the MCS and the on-board flight are 4.76% and 4.75% of all the data points, respectively. The unobservable points are spread in time, thus giving the system ample observable data in between to recover from the short periods of unobservability. Furthermore, isolated events of unobservability are not expected to cause issues. Instead, they can gradually cause an increase in the localization error in time. This has also been confirmed by the simulations in Sect. 3.

Further qualitative inspection of the data does not show a correlation between the unobservable regions of the flight and the relative localization error. To demonstrate this, the localization error is compared to the observability of the filter in Fig. 33 for a small segment of the flight with MCS information. For easier comparison, the observability has been reduced to a binary value, where a value of ‘1’ indicates that the system is within the threshold of unobservability at that time. It can be seen that there is no apparent correlation between the two parameters.

The relative localization insights in this paper have been aimed at leader–follower flight, yet they extend to other applications of MAVs in the real world. For different scenarios such as area coverage, where the relative motion between MAVs may be more (seemingly) random, it is expected that unobservable conditions would be more rare (Cornejo and Nagpal 2015). Therefore, based on our results, we expect that the relative localization performance would also not suffer from unoboservable conditions even in other tasks.

Fig. 33
figure 33

Comparison between localization error and the observability of the filter. An unobservable value of ’1’ means the observability measure is within the threshold of unobservability

5.2 Remarks on scalability

The experimental results in Sect. 4 show that the methods in this paper can successfully scale to two followers that follow a leader in a confined area. Even when full on-board sensing is used by the followers, more than three minutes of successful autonomous flight were demonstrated, with no pilot input.

Despite the successful results, analysis of the data does show a substantial rise in localization and tracking errors when scaling up to two MAVs. This raises the question of what would happen if even more MAVs are added to the experiment; would this be viable?

One of the results we found is that there is a correlation between the tracking performance of the follower and the time delay with which it follows the leader’s trajectory. The follower that tracked with a time-delay of \(8~\mathrm{s}\) showed consistently larger tracking errors than the followers with \(4~\mathrm{s}\) and \(5~\mathrm{s}\) delays. An alternative solution to the two follower problem is having one follower follow the leader and the other following the first follower. With such an arrangement, both followers could follow another MAV with the same time delay. This setup has not yet been studied in this work, but could prove to be a better alternative to explore in future research.

In our experiments, the update rate reduced when flying with two followers instead of one. It is to be expected that adding more MAVs requires additional data communication, yet a drop from 25 to \(16~\mathrm{Hz}\) is quite significant for adding just one more MAV. In this case, the reduction was due to the communication protocol used during the experiments. In future work there should be efforts to determine how to tackle this, which is a necessary step in order to solve scalability issues that will otherwise arise when introducing even more UWB modules.

As an example, it should be possible to significantly increase the messaging rate to allow for more drones. In these experiments we operated the UWB modules on the lowest data rate settings (\(110~\mathrm{kbps}\)). Furthermore, every message contains a lengthy preamble of \(2048~\mathrm{bits}\), resulting in substantial protocol overhead for every transmitted message which may not be necessary (the actual payload of the UWB messages is less than \(200~\mathrm{bits}\)). The maximum data rate that the UWB modules support is actually \(6.8~\mathrm{Mbps}\) and the preamble can be as short as \(64~\mathrm{bits}\). These would allow for much higher update rates, even with three or more MAVs. One would, however, need to examine what such a change would have on ranging accuracy and stability.

6 Conclusion

The work in this paper has shown the feasibility of heading-independent range-based relative localization on MAVs. We now know that removing the dependency on a common heading between MAVs has two main disadvantages: the motion of agents must meet more stringent conditions to be observable and the relative localization becomes more susceptible to noise on the range measurements. The clear advantage, on the other hand, is that the filter is no longer affected by local disturbances in Earth’s magnetic field. As shown by our simulations, small magnetic perturbations can already lead to a large negative impact, showing how a heading-independent method can actually perform better than the heading-dependent method.

The results of our observability analysis have shown that leader–follower flight is a difficult task when using the proposed relative localization method, where a simple fixed geometry formation flight is not possible. Instead, we needed to develop a method that allows one MAV to follow another MAV’s trajectory with a certain time delay while the leader flies in a curved trajectory. This approach has been shown to stay sufficiently clear from unobservable conditions, which has allowed us to successfully demonstrate leader–follower flight in practice.

Using only on-board sensory information, one MAV can localize another MAV with a mean error of just \(22.6~\mathrm{cm}\) over \(200~\mathrm{s}\) of leader–follower flight. This consequently allows the MAV to track another MAV’s trajectory with a mean error of \(50.8~\mathrm{cm}\). The method has been demonstrated to work also with two followers tracking the same leader.

In a wider context, this work showcases a fundamental connection between relative localization and behavior for teams (or swarms) of robots. We have shown that the constraints included in the observability analysis have to be taken into account when designing the behavior of the robots. This enables the robots to make a better use of their sensors, which in turn provides for a better final performance. For example, in our case, the intuitive conditions extracted from the observability analysis informed us that the leader–follower behavior should not be such that the MAVs fly in a fixed geometry. In general, extracting such intuitive conditions can help swarm designers understand, at a higher level, how the behavior of the individual robots should be designed in order to be in harmony with their relative localization sensors.

7 Future work

There are plenty of opportunities to research within the domain of range based relative localization. Certainly, one such opportunity is the initial convergence behavior of the filter. The initial estimate of the EKF is important to quickly converge to a correct estimate of the relative location of another MAV. If the initial condition is too different from the real situation, the filter has difficulties to converge. One primary problem is that there exist spurious states where the EKF can inititially erraneously converge to. In the future, it would thus be interesting to research methods to address this problem. Some possible solutions could be to use of more thorough estimation filters (e.g., particle filters), or to run multiple filters leading to multiple ambiguous states, which would then help to identify the correct estimate more easily. Furthermore, with an eye on scalability to larger swarms, it would be valuable to explore more thoroughly whether the less intuitive unobservable conditions are, as indicated by our analysis, indeed significantly more unlikely than the observable ones.

It would also be valuable to research alternative control algorithms to enable the leader–follower flight. A weakness of the current controller is that it stores and uses the entire most recent portion of the leader’s trajectory in order to replicate it with a certain delay, which is not memory efficient. An alternative solution might be to perform real time polynomial data fitting on the relative positions of the leader. The resulting polynomial trajectories could be used to obtain the velocities and accelerations through analytical derivations of the polynomials. This might result in less data that needs to be stored on-board of the MAVs and also might lead to smoother trajectories. Moreover, currently we require the leader to fly in an oscillatory trajectory in order to help the followers avoid unobservable states. However, the controller on-board of the followers could also be such that their trajectory is automatically adapted in order to preemptively avoid unobservable conditions. This would put less requirements on the leader, which would then be free to fly any type of trajectory, and would also be a more general and robust solution. To this end, the leader could also communicate additional information such as its planned trajectory over a time horizon.

Finally, considering the hardware used in the experiments, the importance of consistent, high frequency communication and ranging has become apparent. It would be valuable to further optimize the frequency and consistency with which ranging messages are exchanged.

8 Videos

Videos of the experiments can be found at: https://www.youtube.com/playlist?list=PL_KSX9GOn2P--aEr4JtFl7SV3LO5QZY4q