1 Introduction

The small size, agility, and low upfront costs of Micro Aerial Vehicles (MAVs) could instigate their widespread use and rapid deployment for payload transport in areas that are inaccessible or dangerous for humans and conventional (aerial) vehicles. Current applications for MAVs with slung payloads (the MAVP system) include search and rescue (Ryan and Hedrick 2005), package/cargo delivery, and construction (Lee 2018) primarily in large, rural, obstacle-free spaces.

Operation of MAVPs in urban settings presents itself with notable challenges given the complex and dynamic environment within which they would operate. A MAVP system is required to be able to quickly, safely, and autonomously navigate an obstacle-ridden space while adapting to different situations. Carriage of a swinging payload vastly increases the system’s spatial footprint making operation in restrictive spaces challenging. In such situations MAV trajectory planning and control is necessary to generate the desired swing motions to avoid collisions with obstacles. Failing to acknowledge the system’s future response when performing agile flight could result in inevitable collisions as by the time an obstacle is to be avoided, the MAV might be unable to divert the swinging payload away. Working around the problem, one may pre-generate trajectories for fully defined environments (and thus static), or actively minimise swing to reduce the system’s dynamic response, however, we will demonstrate that these undermine the real-world practicality of the approaches in dynamic environments.

1.1 Contributions

Our main contribution is an online local motion planner and controller for the safe, agile, and collision-free flight of a MAVP system in dynamic, multi-obstacle settings. Our method is formulated as a constrained optimisation using a finite-horizon Non-linear Model Predictive Control (NMPC). The optimisation problem can be efficiently solved thanks to contemporary fast solvers.

A full system framework is outlined integrating the NMPC controller in a combined hardware and software based control loop. The proposed framework is verified and validated in simulated and experimental studies where we showcase our method’s scalability, adaptability, and performance over various complex tasks in static and dynamic environments. We compare it to state of the art methods and discuss the effect of computation time on the resulting system performance. Unlike previous works, we successfully demonstrate the safe operation and readiness of our method in realistic settings involving a MAVP system operating amongst multiple moving humans.

1.2 Related work

Historically, studies of aerial vehicle control with suspended payloads involved helicopter systems with applications to load transportation as shown in Cicolani and Kanning (1992), however, with the advent of MAVs the research into MAVP systems has gained traction. This paper addresses MAVP motion planning for collision avoidance which we broadly classify into two types, namely open-loop planning with feedback control, and unified closed-loop planning and control; our method contributes to the latter. We introduce contemporary approaches for both followed by a general discussion of NMPC control for MAV(P) systems and its application to closed-loop planning and control.

1.2.1 Open-loop MAVP trajectory planning

Most contemporary approaches to collision-free trajectory planning for MAVP systems have addressed the tracking of pre-generated, possibly agile, trajectories in static workspaces. We refer to these as offline, open-loop planning approaches as there is no in-the-loop dynamic re-planning of trajectories.

Using pre-generated trajectories, planar and spatial tracking of MAVP trajectories has successfully been demonstrated through accurate modelling and stabilisation of the vehicle (Feng et al. 2014; Pizetta et al. 2015) sometimes utilising swing minimisation (Bisgaard et al. 2010; Palunko et al. 2012b; Trachte et al. 2014) to mitigate coupling disturbance effects. The latter approach is energetically inefficient and over-conservative as the vehicle devotes considerable control effort to reduce swing thus resulting in a sluggish system. To accomplish desirable yet feasible MAV and payload responses, the pre-generated trajectories are computed taking the MAVP system model into account. Algorithms to achieve this have included, amongst others, optimisation and Reinforcement Learning (RL) techniques. In the former, optimal trajectories are computed as a cost minimisation problem subject to the task objectives and the MAVP model which are then encoded as full state evolutions (Foehn et al. 2017; Palunko et al. 2012a) or a reduced dimension state using differential-flatness (Sreenath et al. 2013; Tang and Kumar 2015). In RL, as used in Palunko et al. (2013), Faust et al. (2013) and Faust et al. (2017), feasible action policies (the trajectory) are generated that enforce the MAVP model on state transitions.

The main limitation of pre-generating MAVP trajectories is the reliance on task-specific full motion planning which is consequently inherently non-adaptive at run-time thus precluding handling of uncertain, dynamic obstacles. Additionally, any trajectory infeasibility at run-time is catastrophic as the planning method is unable to accommodate for this. Therefore, the aforementioned studies only address fully known environments with static obstacles limiting the practical application of these methods to limited specialised demonstrative cases. In contrast, our method is able to rapidly re-plan even if the local trajectories intermittently become infeasible due to prevailing conditions.

1.2.2 Closed-loop MAVP trajectory planning

Motion planning in dynamic environments requires re-planning at run-time to accommodate for the changing environment. Closed-loop re-planning of full motion trajectories on a global level is intractable for a high-dimensional system, such as that of a MAVP, thus necessitating the use of local planners with finite time-horizons Brock and Khatib (2000). Additionally, local planners need to cope with infeasibility during run-time while still conforming towards a higher global planning objective.

In De Crousaz et al. (2014), an agile and collision-free local trajectory generator and controller method was demonstrated in simulated and experimental setups with static obstacles using iterative Linear Quadratic Gaussian (iLQG) control. The optimal control iLQG method relies on a cost function that is minimised at every control step such that user-defined planning objectives are met; the result is a local trajectory satisfying the objectives and system dynamics. The iLQG’s iterative algorithm is exploited to generate locally optimal linear feedback controls to achieve the real-time, closed-loop performance. In a similar fashion to Tang and Kumar (2015), De Crousaz et al. (2014) apply iLQG to demonstrate impressive manoeuvres which included the flight through a narrow opening. However, the study by De Crousaz et al. (2014) did not consider planning in a dynamic environment. Furthermore, required rotor thrust inputs were generated at run-time which could saturate as the method did not account for vehicle constraints during planning. The consequence of saturating inputs over sustained time periods is the potential system instability and/or significant deviation of the actual to planned trajectory. In contrast, our approach takes into account the vehicle and environmental constraints to ensure the physical feasibility of the generated trajectories.

1.2.3 Non-linear model predictive control and unified planning and control of MAV(P)s

Early studies have successfully demonstrated the use of (N)MPC for real-time MAV (Kim et al. 2002; Tzes et al. 2012) and MAVP (Trachte et al. 2014; Gonzalez et al. 2015) simple trajectory tracking. Focussing on the latter, in Trachte et al. (2014) NMPC for MAVP trajectory tracking control was addressed with a comparison to LQR control; the results demonstrated NMPC’s superior physical constraint handling for feasibility guarantees, and larger attainable MAVP flight envelope from the non-linear MAVP model description. Overall NMPC outperformed LQR in simulated tasks involving swing minimisation and agile manoeuvres. In Gonzalez et al. (2015), studies from Trachte et al. (2014) were extended to an experimental setup validating the results, however, unlike in De Crousaz et al. (2014), both studies only addressed the control aspect of tracking pre-generated trajectories. In contrast, our method unifies the online planning and control extending the capability beyond simple tracking of a pre-defined plan approach.

Traditionally, (N)MPC algorithms for unified motion planning and control of MAVs have seldom been studied as the required real-time re-planning was computationally intractable (Neunert et al. 2016). With today’s improved computing capabilities and fast optimisation solvers applications have been demonstrated for a MAV without swung payload (Neunert et al. 2016; Naegeli et al. 2017). The method of Neunert et al. (2016) utilises the Sequential Linear Quadratic (SLQ) algorithm to solve an unconstrained optimal control problem that is paired with an external high-level planner that pre-generates feasible waypoints accounting for only static obstacles. Therefore, their method is only deployable to MAVs in static environments for traversing a specific pre-generated set of waypoints.

Building upon the NMPC planning algorithm introduced by Naegeli et al. (2017), we utilise an interior-point based algorithm from Domahidi and Jerez (2014) within a NMPC setting to solve a constrained optimal control problem that is subject to the constraints on the system dynamics and environment. Introduction of constraints in our method allows us to perform real-time obstacle avoidance while still providing guarantees on the trajectory feasibility. Thus, in this work we demonstrate the viability of real-time NMPC based unified motion planning and control for MAVPs in dynamic, multi-obstacle settings.

1.3 Paper organisation

We introduce preliminaries in Sect. 2 with our notations and system models. In Sect. 3 we describe our method for online and closed-loop, unified motion planning and control with NMPC. For the simulated and experimental studies performed, we outline our system setup and framework in Sect. 4. In Sects. 5 and 6 we discuss our findings followed by concluding remarks in Sect. 7.

2 Preliminaries

2.1 Notation

The following notations are observed; scalars x, vectors \(\varvec{x}\), matrices \(\varvec{X}\), sets \({\mathcal {X}}\), and reference frames \(\{X\}\). Time derivatives use dot accenting. Position vectors are denoted by \(\varvec{p}\in {\mathbb {R}}^3\). Unless otherwise stated, vectors are expressed in the East-North-Up (ENU) inertial frame \(\{I\}\). For vector \(\varvec{x}\in {\mathbb {R}}^n\) and positive semi-definite \(n{\times }n\) matrix \(\varvec{Q}\), the weighted squared norm is \(\left\| \varvec{x}\right\| _{\varvec{Q}}\buildrel \varDelta \over ={\varvec{x}^\top }{\varvec{Qx}}\). Rotations from frame \(\{A\}\) to \(\{B\}\) are denoted by matrix \(\varvec{R}_{A}^B{\in }SO(3)\) and basic axial rotations around x by \(\varvec{R}_x{\in }SO(3)\).

2.2 Quadrotor with swung payload model

The system comprises a quadrotor of mass \(m_q\) and a suspended point mass \(m_l\) attached by a l length cable from the quadrotor centroid. Let \(\varvec{p}_q\), \(\varvec{p}_l\) be the quadrotor, load position, and \(\varvec{r}_{ql}=\varvec{p}_l-\varvec{p}_q\). All reference frames are defined in Fig. 1. The load suspension angles \(\theta _l,\phi _l\) parametrise the orientation of \(\{L\}\) to \(\{S\}\). Intermediary frame \(\{S\}\) is used to avoid the singularity for a downward equilibrium load position when computing a rotation from \(\{L\}\) to \(\{E\}\) directly. Then let the MAVP configuration and its time derivative be given by variables

$$\begin{aligned} \varvec{q}&=\left[ \varvec{p}_q^\top ,~\theta _l,~\phi _l \right] \in {\mathbb {R}}^5 \\ \varvec{{\dot{q}}}&=\tfrac{d}{dt}\varvec{q}=\left[ \varvec{{\dot{p}}}_q^\top ,~{\dot{\theta }}_l,~{\dot{\phi }}_l \right] \in {\mathbb {R}}^5, \end{aligned}$$

and let \(\theta _q\), \(\phi _q\) be the true quadrotor pitch and roll with yaw remaining constant. The following additional model assumptions are adopted;

  • rigid, massless cable with free suspension points,

  • quadrotor centre of gravity, centroid, and cable suspension point coincide,

  • no aerodynamic drag effects on the cable.

A non-rigid cable would introduce switching dynamics increasing the complexity of the system modelling. A preliminary study on the cable rigidity during agile manoeuvres, as shown in “Appendix A”, shows the cable rarely slacks during flight allowing this assumption to be made. The second assumption regarding coincidence is made to simplify the model as the listed points tend to physically be in close proximity. Though the effects of the real point offsets on the system dynamics were not considered, for the purpose of performing predictions in the order of seconds, the effect was considered to be negligible. The assumption of no cable drag stems from the rationale that the string’s exposed surface area to the flow is small resulting in negligible effects on the system dynamics.

Fig. 1
figure 1

Quadrotor-Payload system with the following references frames; \(\{I\}\) inertial ENU, \(\{E\}\) vehicle-carried ENU, \(\{B\}\) vehicle body frame, \(\{S\}\) is \(\{E\}\) rotated by \(180^\circ \) about the \(\{E\}\)x-axis and \(\{L\}\) load frame with z-axis directed away from the cable’s suspension point. Quadrotor and load positions and relative suspensions angles indicated. Euler angles \(\phi _q,\theta _q\) parametrise frame \(\{B\}\) to \(\{E\}\); constant yaw assumed

We first describe the quadrotor’s input handling and the aerodynamic drag model. We then complete the model by derivation of the coupled quadrotor-load dynamics.

2.2.1 Quadrotor inputs

As in Klausen et al. (2015), we abstract quadrotor motor inputs and assume fast attitude and motor control such that by actuating the quadrotor’s pitch and roll, and setting a thrust command, we produce an inertial control force \(\varvec{F}_u\) in any desired direction for realising translational motion. Therefore, let the inputs be a desired quadrotor pitch (radians), roll (radians), and thrust command (meters/second) defined in \(\{E\}\) giving

$$\begin{aligned} \varvec{u} = \left[ {{{\bar{\theta }}}_q},~{{{\bar{\phi }}}_q},~{{{\bar{w}}}_q}\right] \in {\mathbb {R}}^{3}. \end{aligned}$$

This input choice is consistent with our chosen Parrot Bebop 2Footnote 1 quadrotor that internally controls motors based on inputs \(\varvec{u}\) to achieve full spatial flight; the internal controller is schematised in “Appendix B”. The input magnitude ranges are limited and hardware specific; these are accounted for in the design of the model predictive controller using constraints as further discussed in Sect. 3.3. We note that our method is not limited to the our chosen hardware and could easily be adapted to other quadrotors.

As the hardware-specific internal controller dynamics \(\varvec{u} \rightarrow \varvec{F}_u\) are not documented, we empirically model the function. We define \(F_q\) as the purely vertical control force generated by the four rotors when commanded by input \({{{\bar{w}}}_q}\). The quadrotor’s true pitch, roll response and the vertical control force \(F_q\) resulting from commanded inputs are given by

$$\begin{aligned} \left[ \theta _q,~\phi _q,~F_q\right] = \left[ h_{\theta }({{{\bar{\theta }}}_q}),~h_{\phi }({{{\bar{\phi }}}_q}),~h_{F}({{{\bar{w}}}_q})\right] \end{aligned}$$

where, using the method presented in Stanculeanu and Borangiu (2011), \(h_{\theta },~h_{\phi },~h_F\) are identified for the fast dynamics and decoupled as three linear second-order black-box models with model states and state transition

$$\begin{aligned} \varvec{x}_c&= \left[ x_{\theta ,1},~x_{\theta ,2},~x_{\phi ,1},~x_{\phi ,2},~x_{F,1},~x_{F,2}\right] \in {\mathbb {R}}^6 \nonumber \\ \varvec{\dot{x}}_c&= f_c(\varvec{x}_c, \varvec{u}). \end{aligned}$$

Note that with \(h_F\) we model \({{\bar{w}}}_q \rightarrow F_q\) directly as the internal vertical velocity stabiliser controls the vertical control force \(F_q\) (in \(\{E\}\)) generated by the motors based on the thrust command \({{\bar{w}}}_q\) (See “Appendix B”). Then similar to Naegeli et al. (2017), using outputs from (1) and based on equilibrium relations, the input control force is given by

$$\begin{aligned} \varvec{F}_u = \left[ m\tfrac{\tan (\theta _q)}{\cos (\phi _q)}g,~-m\tan (\phi _q)g,~{F_q + mg}\right] \in {\mathbb {R}}^3 \end{aligned}$$

where \(m=m_q+m_l\) and \(g=9.81\) m/s\(^2\). See “Appendix C” for a derivation of this relation.

The full-form of (1) identified for the Parrot Bebop 2 quadrotor is provided in “Appendix D”.

2.2.2 Aerodynamic drag effects

We derive the drag induced forces on the MAVP system; as in Derafa et al. (2006), assuming relatively low quadrotor velocities \(\varvec{{\dot{p}}}_q\) (up to 5 m/s) we model a proportional linear drag force on the quadrotor with drag constant \(k_{Dq}\) giving

$$\begin{aligned} \varvec{F}_{Dq} = k_{Dq}\varvec{{\dot{p}}}_q. \end{aligned}$$

As in Klausen et al. (2015), for the payload we only consider the rotational load motion relative to the quadrotor, hence, its drag force is assumed to always be perpendicular to the moment arm (the rigid cable). The approximation introduces swing associated damping due to the relatively large rotational payload velocities. This allows the load’s drag to be described in terms of the suspension angles and its derivative which are components of \(\varvec{q}\) and \(\varvec{{\dot{q}}}\). We also avoid defining the drag in terms of load velocity as this term is not a variable available in \(\varvec{{\dot{q}}}\). Additionally, following from our free suspension point assumption, there are no payload drag induced reactive forces or moments on the quadrotor. We note that prolonged linear translation of the system would make the linear drag contribution to the load dynamics significant as the load would trail behind the quadrotor; this could be included in future studies. Under these simplifications, the load’s signed quadratic drag force with drag constant \(k_{Dl}\) is given by

$$\begin{aligned} F_{Dl} = k_{Dl} v^2\tfrac{v}{\left| v\right| } \equiv k_{Dl} l^2 \omega ^2\tfrac{\omega }{\left| \omega \right| } \end{aligned}$$

where \(v = \omega l\) for circular motion with v, \(\omega \) the linear, angular load velocities and l the cable length. Substituting \(\omega \) in (5) by the load’s suspension angular rates and computing the induced moment at the suspension point we obtain

$$\begin{aligned} \left[ \tau _{\theta },~\tau _{\phi }\right] = k_{Dl} l^3 \left[ \omega _{\theta }^2\tfrac{\omega _{\theta }}{\left| \omega _{\theta }\right| },~\omega _{\phi }^2\tfrac{\omega _{\phi }}{\left| \omega _{\phi }\right| }\right] \end{aligned}$$

where \(\omega _{\theta }={\dot{\theta }}_l\), \(\omega _{\phi }={\dot{\phi }}_l\) and \(\tau _{\theta }\)\(\tau _{\phi }\) are the load’s drag force induced moments on the suspension angles \(\theta _l\)\(\phi _l\). With (4) and (6), the total exogenous system drag term is

$$\begin{aligned} \varvec{D}(\varvec{{\dot{q}}}) = \left[ \varvec{F}_{Dq}^\top ,~\tau _{\theta },~\tau _{\phi }\right] ^\top . \end{aligned}$$

2.2.3 System kinematics and dynamics

The MAVP Equations of Motion (EOMs) are derived in frame \(\{I\}\) according to Lagrangian mechanics. With frame transformations

$$\begin{aligned} \varvec{R}_{L}^S&= {\varvec{R}_y}(\phi _l){\varvec{R}_x}(\theta _l) \end{aligned}$$
$$\begin{aligned} \varvec{R}_{S}^E&= \varvec{R}_{x}(\pi ) , \end{aligned}$$

and \(\varvec{l}=\left[ 0,~0,~l\right] ^\top \) the rigid cable vector in \(\{L\}\), we define the load position as

$$\begin{aligned} \varvec{p}_l = \varvec{p}_q + \varvec{r}_{ql} = \varvec{p}_q + \varvec{R}_{S}^E \varvec{R}_{L}^S\varvec{l}. \end{aligned}$$

The payload velocity is then given by

$$\begin{aligned} \varvec{{\dot{p}}}_l = \tfrac{d}{dt}\varvec{p}_l = \varvec{{\dot{p}}}_q + \varvec{R}_{S}^E {{\varvec{{\dot{R}}}_{L}^S}} \varvec{l}. \end{aligned}$$

The Lagrangian in terms of the system kinetic and potential energies is

$$\begin{aligned} L = \underbrace{0.5{\left\| \varvec{\left[ \varvec{{\dot{p}}}_q,~\varvec{{\dot{p}}}_l\right] ^\top } \right\| _{\varvec{K}}}}_{\text {kinetic energy}} - \underbrace{g\left( m_q \varvec{{\dot{p}}}_q^\top \varvec{e}_3 + m_l \varvec{{\dot{p}}}_l^\top \varvec{e}_3\right) }_{\text {potential energy}} \end{aligned}$$

where \(\varvec{K}=\text {diag}(m_{q (1 \times 3)},~m_{l (1 \times 3)})\) and \(\varvec{e}_3=\left[ 0,0,1\right] ^\top \).

Using Lagrange’s equations according to D’Alembert’s principle, the non-linear EOMs describing the MAVP dynamics in compacted form are given by

$$\begin{aligned} \varvec{\ddot{q}} = \varvec{M}^{-1}(\varvec{q})\left( \varvec{F} - \varvec{D}(\varvec{{\dot{q}}}) - \varvec{C}(\varvec{q},\varvec{{\dot{q}}}) - \varvec{G}(\varvec{q}) \right) \end{aligned}$$

with force \(\varvec{F}=\left[ \varvec{F}_u,~0,~0\right] ^\top \in {\mathbb {R}}^5\), mass \(\varvec{M}\), drag \(\varvec{D}\) from (7), Coriolis \(\varvec{C}\) and gravitational \(\varvec{G}\) matrix terms. Equation (13) in its full form is presented in Klausen et al. (2015). Using (13), the system state and state transition are given by

$$\begin{aligned} \varvec{x}_q&= \left[ \varvec{q},~\varvec{{\dot{q}}}\right] \in {\mathbb {R}}^{10} \nonumber \\ \varvec{\dot{x}}_q&= \left[ \varvec{{\dot{q}}},~\varvec{\ddot{q}}\right] = f_q(\varvec{x}_q, \varvec{F_u}). \end{aligned}$$

2.2.4 Full MAVP model

Combining the quadrotor input and system model from (2) and (14), we denote the full MAVP state and state transition by

$$\begin{aligned} \varvec{x}&= \left[ \varvec{x}_c,~\varvec{x}_q \right] \in {\mathbb {R}}^{16} \nonumber \\ \varvec{\dot{x}}&= \left[ \varvec{\dot{x}}_c,~\varvec{\dot{x}}_q\right] = f(\varvec{x},\varvec{u}). \end{aligned}$$

Important MAVP model related variables and parameters that we often refer to are summarised in Table 1.

Table 1 MAVP system variables and parameters

2.3 Obstacle model

Obstacles with each position \(\varvec{p}_o\) are user-specified as cuboids and subsequently modelled by enclosing ellipsoids. Human obstacles are also specified as a cuboid of comparable size. Ellipsoids create smooth convex bounding volumes for (non-convex) obstacles making them appropriate for representing objects including trees, humans and pillars. Additionally, computationally efficient collision checks against the ellipsoid’s quadric exist (Uteshev and Goncharova 2018) making them favourable for real-time applications.

2.3.1 Obstacle ellipsoid definitions

Let the ellipsoid semi-principal axes \((a_o,b_o,c_o)\) be proportional to the specified cuboid dimensions \((u_o,v_o,w_o)\) such that there is ellipsoid surface contact at all cuboid corners, hence

$$\begin{aligned} \left( {{a_o},{b_o},{c_o}} \right) = {\tfrac{\sqrt{3}}{2}} \left( {{u_o},{v_o},{w_o}} \right) . \end{aligned}$$

We define two ellipsoids with buffers \(\beta \) as shown in Fig. 2;

  1. 1.

    the bounding ellipsoid\(S_{o}\) with dimensions \(\left( {a_o+\beta _o},{b_o+\beta _o},{c_o+\beta _o}\right) \) models the obstacle against which collisions are checked,

  2. 2.

    the expanded ellipsoid\(S_{e}\) with dimensions \(\left( {a_o+\beta _e},{b_o+\beta _e},{c_o+\beta _e}\right) \) represents a padding identified as a high risk zone used for planning safer trajectories.

Note by setting \(\beta \), a minimum cuboid to ellipsoid separation of \(\beta \) is warranted. Buffers \(\beta _o, \beta _e\) are used for collision-avoidance purposes as will become clear later.

2.3.2 Obstacle motion prediction

Static obstacle positions are assumed to be readily available for planning. As in Naegeli et al. (2017), we assume a constant velocity model for dynamic obstacles and predict their future positions based on a velocity estimate produced by a linear Kalman Filter using measured obstacle position data.

Fig. 2
figure 2

Cuboid obstacle (left) with fixed position \(\varvec{p}_o\) or dynamic (human) obstacle (right) with constant velocity \(\varvec{v}_o\) each modelled by bounding \(S_{o}\) and expanded \(S_{e}\) ellipsoid with dimensional buffers \(\beta \)

2.4 MAVP-obstacle collision avoidance requirements

Imperative to collision avoidance is ensuring separation between the MAVP and obstacles. By quantifying the quadrotor, load, and cable’s proximity to an obstacle we define mathematical requirements to guarantee a collision-free system.

2.4.1 Point to ellipsoid distance

The point to an ellipsoid signed distance is approximated as the true value cannot be expressed in closed form (Uteshev and Goncharova 2018). For a generic ellipsoid S with buffered dimensions \(\left( {{a_o+\beta },{b_o+\beta },{c_o+\beta }}\right) \) and position \(\varvec{p}_o\), the approximate signed distance to a point \(\varvec{p}\) based on the ellipsoid equation is

$$\begin{aligned} d_{o}(\varvec{p},S) = \left\| {\varvec{p}} - {\varvec{p}_o} \right\| _\varOmega - 1 \end{aligned}$$

where \({\varOmega } = \text {diag}(1/(a_o + \beta )^2,1/(b_o + \beta )^2,1/(c_o + \beta )^2)\).

When \(\varvec{p}\) is inside or on S, \(d_{o}\le 0\), and as \(\varvec{p}\) is further away from S, \(d_{o}\) increases from 0 to infinity.

2.4.2 Quadrotor and payload proximity

We model the quadrotor and payload individually by a bounding sphere with an associated radius \(r_c\). Without loss of generality, we assume an equal \(r_c\) for the quadrotor and payload. Consider the quadrotor; using the obstacle’s bounding ellipsoid \(S_o\) and setting \(\beta _o > r_c\) and \(\varvec{p} = \varvec{p}_q\), then using (16) we can guarantee the quadrotor does not collide with the cuboid shaped obstacle provided

$$\begin{aligned} d_{o}(\varvec{p}_q,~S_o)>0. \end{aligned}$$

Similarly, considering the payload associated bounding sphere and position \(\varvec{p}_l\) gives

$$\begin{aligned} d_{o}(\varvec{p}_l,~S_o)>0. \end{aligned}$$

2.4.3 Rigid cable proximity

Modelling the cable as a mobile finite line segment we identify the cable’s Closest Point of Approach (CPA) to \(S_o\) denoted by \(\varvec{p}_c^ * \); this is the cable’s most critical point for collisions simplifying the check as a point to ellipsoid computation. Given the cable cross-section dimensions are negligible, no buffer is required so \(\beta _o=0\). Using (16), \(\varvec{p}_c^ * \) is computed by

$$\begin{aligned} \varvec{p}_c^ * = \mathop {\arg \min }\limits _{{\varvec{p}_c}} \left( d_{o}(\varvec{p}_c,~S_o) \right) \end{aligned}$$

with \({\varvec{p}_c} \in \left\{ {\left. \varvec{p} \right| \varvec{p} = {\varvec{p}_q} + s({\varvec{p}_l} - {\varvec{p}_q}),s \in \left[ {0,1} \right] } \right\} \). “Appendix E” shows the problem (19) is expressible in closed-form and analytically solvable. Using (19) the cable is guaranteed to be collision-free with respect to the cuboid obstacle provided

$$\begin{aligned} d_{o}(\varvec{p}_c^*,~S_o) > 0. \end{aligned}$$

Requirements (17-18,20) must be satisfied with respect to each obstacle to guarantee a collision-free MAVP system.

3 Online and closed-loop MAVP trajectory generation

3.1 Method overview

The planning and control objective is to navigate the MAVP system from an initial position \(\varvec{p}_\text {start}\) to a user-definable goal position \(\varvec{p}_\text {goal}\) in a safe, agile, and collision-free manner. To accommodate for the dynamic environment, we perform dynamic and closed-loop local motion planning using NMPC which is a receding finite-horizon controller.

3.1.1 Receding horizon dynamic planning

Denote by \(\varDelta t\) the time-step, by k the stage index, and by N the finite planning horizon (number of stages). At every sampling instance t we generate a local trajectory of duration \(N\varDelta t\) encoded as a sequence of \(N+1\) states that includes the initial state \(\varvec{x}_0\), the transition states \(\varvec{x}_k\), and a terminal state \(\varvec{x}_N\) thus giving

$$\begin{aligned} \varvec{{\tilde{x}}}:= \left[ \varvec{x}_0,\ldots ,\varvec{x}_N\right] . \end{aligned}$$

For state realisation, the associated input sequence up to the terminal state is denoted by

$$\begin{aligned} \varvec{{\tilde{u}}}:= \left[ \varvec{u}_0,\ldots ,\varvec{u}_{N-1}\right] . \end{aligned}$$

Following execution of \(\varvec{u}_0\), the planning is receded by \(\varDelta t\) to \(t+\varDelta t\). At the next sampling instance the new obstacle positions and a new initial state estimate \(\varvec{x}_0\) are obtained. Subsequently, a local trajectory is re-generated by initialising the solver with a time-shifted version of the previous solution. The time-shift in simulation studies is a fixed simulation time step, and in real experiments the actual control loop time is used. This approach results in our method’s computationally efficient closed-loop performance with robustness to model uncertainties (Naegeli et al. 2017); we illustrate this process in Fig. 3.

Fig. 3
figure 3

System moves towards \(\varvec{p}_\text {goal}\) with \(t_2>t_1\); planned local trajectory (grey) at the current time (left) that is updated in a future time (right) with the new state and obstacle data. Schematic projected top view with illustrative obstacle ellipsoids shown

3.1.2 Local trajectory generation

At every sampling instance we solve a constrained optimisation problem. The designer encodes the desired planning objectives in an objective function using costs to quantify the generated trajectory’s performance. The costs are designed to lower with an increasing satisfaction of the objective. For every trajectory stage k, we evaluate an associated scalar cost giving a cost sequence

$$\begin{aligned} \varvec{{\tilde{c}}} := \left[ c_0,\ldots ,c_N\right] . \end{aligned}$$

Within (23), the trajectory stage costs are given by

$$\begin{aligned} c_k = c_s(\varvec{x}_k,\varvec{u}_k,*_k), k \in \left[ 0,N-1\right] \end{aligned}$$

where function \(c_s\) is evaluated on the predicted state, input, and any online environment variables (obstacle positions, navigation goal, slacks etc.) that we denote by \(*\). The terminal cost is given by

$$\begin{aligned} c_N = c_t(\varvec{x}_N,*_N) \end{aligned}$$

where function \(c_t\) is evaluated on the variables of the terminal stage N. Terminal costs are used to achieve closed-loop stability of the finite-horizon planner (Mayne et al. 2000).

We then quantify the full trajectory’s performance by the objective function defined as

$$\begin{aligned} J = \displaystyle \sum \nolimits _{k = 0}^{N} {c}_k. \end{aligned}$$

Constraints are introduced to limit the solution space for the trajectory which is encoded in \(\varvec{{{\tilde{x}}}}\) and \(\varvec{{{\tilde{u}}}}\) thus providing (feasibility) guarantees for the computed trajectory. To ensure the optimiser always returns a solution at run-time, we may tolerate minor constraint violations by introducing non-negative slack variables that soften the constraint (Zheng and Morari 1995). We encode the slack variables associated with the trajectory in

$$\begin{aligned} \varvec{{\tilde{s}}}:= \left[ \varvec{s}_0,\ldots ,\varvec{s}_N\right] . \end{aligned}$$

A planning violation occurs when the optimiser produces positive entries of \(\varvec{{\tilde{s}}}\). A physical violation only occurs when the real system breaches constraints, i.e., the current slack \(\varvec{s}_0\) of \(\varvec{{{\tilde{s}}}}\) is positive. By associating a high slack related cost in the optimisation objective function, we avoid positive entries of \(\varvec{{\tilde{s}}}\) and accordingly any planning and physical violations (Zheng and Morari 1995).

During optimal trajectory generation we minimise (26) while respecting the constraints resulting in a \(N{\varDelta }t\) length locally feasible trajectory. In subsequent sections we introduce the costs and constraints after which we formalise the optimisation algorithm in Sect. 3.4.

3.2 Costs

We introduce cost terms derived from our planning objectives presented in Sect. 3.1. We use our weighted square norm definition from Sect. 2.1 with an \(n \times n\) identity matrix denoted by \(\varvec{I}_n\) to make all cost terms scalar and positive.

3.2.1 Point-to-point navigation

For navigation we minimise the displacement between the quadrotor position and goal \(\varvec{p}_\text {goal}\). Let \(\varvec{p}_\text {start}\) be the start position, then we normalise the cost to treat all start to goal distances equally. The cost term is given by

$$\begin{aligned} {c_{\text {nav}}}= \frac{{{{\left\| {{\varvec{p}_{{\text {goal}}}} - {\varvec{p}_q}} \right\| }_{\varvec{I}_3}}}}{{{{\left\| {{\varvec{p}_{{\text {goal}}}} - {\varvec{p}_{{\text {start}}}}} \right\| _{{\varvec{I}_3}}}}}}. \end{aligned}$$

Making (28) a stage cost would mean the shortest path (straight line) is always preferred which may result in deadlock for cases where it is necessary to go around an obtrusive obstacle. Therefore, we use (28) only as a terminal cost thus allowing curved paths to be generated such that locally and terminally the system reaches a more favourable position.

3.2.2 Potential field based obstacle separation

For obstacle separation, we employ a MAVP to obstacle proximity related cost analogous to a reactive potential field (Khatib 1986). We combine this with constraints to guarantee collision-free trajectories as is presented in Sect. 3.3.3. This two layered approach, similar to Naegeli et al. (2017), enhances the operational safety by pro-actively reducing the collision risk especially for unmodelled system and obstacle dynamics.

Let \(\varvec{p}\) be the quadrotor, load, or cable’s CPA position [see (19)]; for each object we compute a cost. Let \(\varvec{p}_o\) be the obstacle’s predicted position, then the potential cost term activates when \(\varvec{p}\) is in the obstacle’s expanded ellipsoid \(S_e\), i.e., using (16), \(d_{o}(\varvec{p},~S_e) < 0\). We choose the \(S_e\) associated buffer \(\beta _e\) such that \(\beta _e \gg \beta _o\). Observing that \(\left| d_{o}(\varvec{p},~S_e)\right| \) increases from zero to one as point \(\varvec{p}\) moves from the ellipsoid surface towards its centre, by penalising a \(\varvec{p}\) more towards the centre, we naturally discourage \(\varvec{p}\) from getting closer to the smaller bounding ellipsoid \(S_o\). Given \(S_o\) models the actual obstacle, using this method we promote separation from the obstacle. With this insight, and using (16), the cost is formalised as

$$\begin{aligned} {c_\text {pf}}= \left\{ {\begin{array}{ll} {{{\left\| d_{o}(\varvec{p},S_e) \right\| }_{{\varvec{I}_1}}}},&{}\quad \text {if } d_{o}(\varvec{p},S_e) < 0 ,\\ 0,&{}\quad \text {otherwise}. \end{array}} \right. \end{aligned}$$

3.2.3 Input magnitude regulation

The input magnitude associated cost is given by

$$\begin{aligned} {c_\text {in}} = {\left\| \varvec{u}^\top \right\| _{{\varvec{I}_3}}}. \end{aligned}$$

For our agile manoeuvres, we weigh this cost low. Associating a high cost will improve the system’s energy-efficiency by the conservative use of large inputs.

3.2.4 Payload suspension angles regulation

The suspension angle associated cost is given by

$$\begin{aligned} {c_\text {swing}} = {\left\| \left[ \theta _l,\phi _l \right] ^\top \right\| _{{\varvec{I}_2}}}. \end{aligned}$$

For our agile manoeuvres, we weigh this cost low. Associating a high cost will minimise the swing angles if desired.

3.3 Constraints

We derive planning constraints from our system and setup limits in addition to the global planning objectives.

3.3.1 MAVP dynamics

The process model state transition given by (15) is discretised and enforced on the trajectory state evolution by an inter-stage equality constraint

$$\begin{aligned} \varvec{x}_{k+1} = f(\varvec{x}_k,\varvec{u}_k) \end{aligned}$$

where k is the stage index.

3.3.2 State and input limits

The state and input values are bound to the system allowable ranges. Let \({\mathcal {X}}_\text {min}\), \({\mathcal {X}}_\text {max}\) and \({\mathcal {U}}_\text {min}\), \({\mathcal {U}}_\text {max}\) denote the state and input range limits, then the following inequalities must be satisfied

$$\begin{aligned} {\mathcal {X}}_\text {min} \le \varvec{x}&\le {\mathcal {X}}_\text {max} \end{aligned}$$
$$\begin{aligned} {\mathcal {U}}_\text {min} \le \varvec{u}&\le {\mathcal {U}}_\text {max}. \end{aligned}$$

We specify the hardware-specific limits in Section 4.

3.3.3 Collision-free planning

Collision-free trajectory planning is guaranteed by constraining the allowable system’s spatial states. Let \(\varvec{p}\) be the quadrotor, load, or cable’s CPA position [see (19)]; for each we define a constraint. Adopting the requirements (17-18, 20) for a collision-free MAVP system as presented in Sect. 2.4, and using (16), the associated constraint is formalised as

$$\begin{aligned} {d_{o}(\varvec{p},S_o)} + {s_c} > 0 \end{aligned}$$

with the non-negative scalar slack \(s_c\). Note that even though (35) is defined for every obstacle, only one slack \(s_c\) is defined and used in all obstacle associated inequality constraints per timestep in our optimisation problem definition. When any \({d_{o}(\varvec{p},S_o)}\) becomes negative, \(s_c\) assumes the highest value such that all obstacle associated inequality constraints are satisfied according to (35) for a specific timestep. Ideally \(s_c\) is zero hence the system is collision-free with respect to all obstacles. Provided a sufficiently high penalty cost associated with the slack, the feasible solution is recovered as shown in Kerrigan and Maciejowski (2000). Summarising, only one slack is necessary to differentiate between a system in collision or collision-free state with no necessary differentiation with respect to which obstacle that occurs.

3.3.4 Workspace limits

For confined (indoor) operation, the quadrotor and load position is limited to the workspace limits. Assume a cuboid workspace, then let \({\mathcal {W}}_\text {min}\), \({\mathcal {W}}_\text {max}\) denote the minimum and maximum workspace coordinates in frame \(\{I\}\), and between which the cuboid’s space diagonal is defined, then the following inequalities must be satisfied

$$\begin{aligned} \varvec{p}_q + \varvec{1}_{3}s_q&\ge {\mathcal {W}}_\text {min} \quad \text {and} \quad \varvec{p}_q - \varvec{1}_{3}s_q \le {\mathcal {W}}_\text {max} \end{aligned}$$
$$\begin{aligned} \varvec{p}_l + \varvec{1}_{3}s_l&\ge {\mathcal {W}}_\text {min} \quad \text {and} \quad \varvec{p}_l - \varvec{1}_{3}s_l \le {\mathcal {W}}_\text {max} \end{aligned}$$

with \(\varvec{1}_{3}=\left[ 1,1,1\right] ^\top \) and the non-negative scalar slacks \(s_q\), \(s_l\). When a constraint violation occurs, the slacks assume the highest value required to satisfy the associated workspace inequalities. Under workspace convexity, we also guarantee the rigid cable remains inside \({\mathcal {W}}\). Note that the inequalities are written in short form, however, for implementation each vector dimension would each have an individually defined inequality.

3.3.5 Scalability to large obstacle rich workspaces

We set a maximum omnidirectional obstacle detection range originating from the quadrotor position whereby we disregard any obstacles beyond the range for planning purposes. Therefore, the previously introduced obstacle related costs and constraints are dynamically implemented.

3.4 Optimisation algorithm

Local trajectory generation is formulated as a constrained optimisation problem subject to the following costs and constraint definitions;

3.4.1 Costs

We define the stage and terminal cost functions based on the cost term definitions (28296530-31). Let w denote a user-definable weighting used to assign relative importance to costs and their associated objective, then the stage cost function, as introduced in (24), is defined as

$$\begin{aligned} \begin{aligned} {c_s}(\varvec{x}_k,\varvec{u}_k,*_k)&= w_\text {in}c_\text {in} + w_\text {swing}c_\text {swing} \\&\quad +\, \varvec{w}_\text {pf}^\top \varvec{c}_\text {pf} + \varvec{w}_\text {slack}^\top \varvec{s}, k \in \left[ 0,N-1\right] \end{aligned} \end{aligned}$$

where \(\varvec{w}_\text {pf}\) and \(\varvec{c}_\text {pf}\) are vectors of weights and costs equal in size to the number of obstacles, and \(\varvec{w}_\text {slack}^\top \varvec{s}\) all the slacks associated cost where \(\varvec{s}=\left[ s_c,s_q,s_l\right] \in {\mathbb {R}}^3\). The terminal cost function, as introduced in (25) is defined as

$$\begin{aligned} \begin{aligned} c_t(\varvec{x}_N,*_N)&= w_\text {nav}c_\text {nav} + w_\text {in}c_\text {in} + w_\text {swing}c_\text {swing} \\&\quad +\, \varvec{w}_\text {pf}^\top \varvec{c}_\text {pf} + \varvec{w}_\text {slack}^\top \varvec{s}. \end{aligned} \end{aligned}$$

3.4.2 Constraints

We impose the system dynamics, and state/input constraints, as introduced in Sect. 3.3, on the optimiser. By function \(\varvec{g}(\varvec{x}_k,\varvec{u}_k,*_k)\) we denote all additional inequality constraints as defined in (35-37) which are functions of state, input and additional (time specific) variables/parameters, e.g., obstacle positions and workspace limits.

Combining our previously introduced trajectory variables (212227), we denote the optimisation variable by

$$\begin{aligned} \begin{aligned} \varvec{{\tilde{z}}}&=\left[ \varvec{x}_0,\ldots ,\varvec{x}_N,\varvec{u}_0, \ldots ,\varvec{u}_{N-1},\varvec{s}_0,\dots ,\varvec{s}_N\right] \\&\equiv \left[ \varvec{{\tilde{x}}},\varvec{{\tilde{u}}},\varvec{{\tilde{s}}}\right] . \end{aligned} \end{aligned}$$

With the estimated initial state \(\varvec{x}_0\), we optimise \(\varvec{{{\tilde{z}}}}\) such that the objective function (26) is minimised resulting in a locally optimal and feasible trajectory. With costs and constraints stacked together over all stages and obstacles, the optimisation problem that is solved at every planning time instance t is formally defined as

$$\begin{aligned} \begin{aligned} \mathop {\min }\limits _{\varvec{{\tilde{z}}}}&\quad J = {c_t}({\varvec{x}_{N}},{*_{N}}) + \mathop \Sigma \nolimits _{k = 0}^{N - 1} {{c_s}({\varvec{x}_{k}},{\varvec{u}_{k}},{*_{k}})} \\ \text {s.t.}&\quad \varvec{x}_0 = \varvec{x}(t)&\text {(Initial Estimated State)}\\&\quad \varvec{x}_{k+1} = f(\varvec{x}_{k},\varvec{u}_{k})&\text {(Discretised Dynamics)}\\&\quad g(\varvec{x}_{k},\varvec{u}_k,*_k) \ge 0&\text {(Inequality Constraints)}\\&\quad \varvec{x}_{k} \in {\mathcal {X}}&\text {(State Constraints)}\\&\quad \varvec{u}_{k} \in {\mathcal {U}}&\text {(Input Constraints)}\\&\quad \varvec{s}_{k} \ge 0&\text {(Slack Constraints).} \end{aligned} \end{aligned}$$

3.5 Theoretical analysis

3.5.1 Problem dimensionality

Variable \(\varvec{{{\tilde{z}}}}\) is optimised at every planning time instance encoding the optimised local trajectory in its solution. As given by (40), \(\varvec{{{\tilde{z}}}}\) comprises a sequence of \(N+1\) states \(\varvec{x}\in {\mathbb {R}}^{16}\), N inputs \(\varvec{u}\in {\mathbb {R}}^3\) and \(N+1\) slacks \(\varvec{s}\in {\mathbb {R}}^3\), hence \(\varvec{{{\tilde{z}}}}\in {\mathbb {R}}^{22N+19}\).

3.5.2 Optimality and feasibility

We use a fast non-linear programming based optimiser, namely FORCES Pro (Zanelli et al. 2017), on our non-convex optimal control problem. FORCES Pro implements a primal-dual interior-point constrained optimisation algorithm which is outlined in “Appendix F”. Consequently, the computed trajectories are only locally optimal over the planning horizon N with the possibility of deadlocks when the planned trajectory converges to any local optima in the solution space. When this occurs, our planner holds the MAVP at this locally optimal state until a solution becomes available as a result of the changing environment or external perturbation to the system.

Planning feasibility is warranted over the full N stages when all optimised slacks \(\varvec{{{\tilde{s}}}}\) are zero. When full planning feasibility is not realised, provided that at least the current slack \(\varvec{s}_0\) is zero, the current system state and inputs will be feasible. Re-planning at a future instance can re-establish full planning feasibility.

A comprehensive overview of the optimality and stability of (N)MPC algorithms is available in Mayne et al. (2000).

4 System setup and framework

We outline our particular implementation of the system model and NMPC controller for simulated and experimental studies, including a state estimator.

4.1 System properties and hardware

The MAVP system properties used for all studies are given in Table 2. The maximum pitch, roll and thrust command inputs are derived from quadrotor’s system limits as used in this study. The MAVP hardware is shown in Fig. 4.

Table 2 MAVP system properties as used for study
Fig. 4
figure 4

Parrot Bebop 2 quadrotor with cable suspended load and attached tracking markers

4.2 Workspace

We perform studies in a simulated and real workspace measuring \(6.0 \times 3.0 \times 2.6\,\hbox {m}\) (L\(\times \)W\(\times \)H). The real indoor workspace has an OptiTrackFootnote 2 Motion Capturing System (MCS) that can track markers for obtaining rigid-body pose measurements in SE(3) at around 120 Hz.

4.3 Programmed control system framework

The control framework is schematised in Fig. 5; in “Appendix B” the on-board control component is expanded upon. The off-board components run on an Intel i7-3610QM Quad Core 3.3GHz processor PC, and is programmed in MATLAB with an efficient C language solver FORCES Pro performing the online NMPC computations (Zanelli et al. 2017). All studies are performed on the same computer with at maximum one core being utilised by FORCES Pro at run-time. The on-board components run on the MAVP hardware; for simulated studies we replicate this with our system model. In experiments, communication between hardware is performed over a ROS based network. As it is shown in Sect. 5, the controller loop time was on median in the order of \(10^{-2}\,\hbox {s}\), therefore, for simulation and controller design, explicit Runge–Kutta \(2^\text {nd}\) order integration of the system model is used. Runge–Kutta \(2^\text {nd}\) was chosen as a preliminary examination of the predicted system response showed it provided a good balance between loop time performance and dynamics approximation. An in-depth comparison of other discretisation methods is beyond the scope of this paper and should be investigate in future studies. The implemented NMPC cost weights are given in Table 3.

Fig. 5
figure 5

Control system framework with the off-board NMPC controller and state estimator sharing a MAVP model, and the on-board MAVP system. Quadrotor input controller performs closed-loop low-level control. External motion capturing produces measurements necessary for state estimation

Table 3 Implemented default cost term weights

4.4 Cascaded Kalman filter state estimator

Kalman Filtering (KF) based state estimation of \(\varvec{x}\) is performed using the system process model, and MCS based sub-millimetre measurements of the quadrotor and load’s pose in SE(3) (Point 2009). As the Parrot Bebop’s standard interface lacks a high frequency output of internal sensor measures to off-board clients, they were unusable for our state estimation routine. The MAVP system for which we present process models in Sect. 2.2 comprises (i) a quadrotor specific input controller, and (ii) the general non-linear MAVP system. We distribute the state estimation over two KFs permitting individual treatment of the subsystems and maintaining modularity.

A Linear KF is used to estimate state \(\varvec{x}_c\) of the quadrotor input model (2). The MCS provides measures of real pitch \(\theta _q\) and roll \(\phi _q\) for estimating states \(x_{\theta _1},x_{\theta _2},x_{\phi _1},x_{\phi _2}\). Lacking necessary measurements of the vertical control force \(F_q\), states \(x_{F,1},x_{F,2}\) are only predicted without performing the KF measurement update step. Similar to Bisgaard et al. (2007), a non-linear Unscented KF is used to estimate the MAVP states \(\varvec{x}_q\) of the system model (14). Measures for state variable \(\varvec{q}\) are directly reconstructed from MCS data using the kinematic relations introduced in Sect. 2.2.3. Using the process model, the Unscented KF is primarily tuned to provide noise reduced estimates of the time derivatives of \(\varvec{q}\) in \(\varvec{x}_q\). The Unscented KF directly uses the non-linear and observable process model for state prediction without performing linear approximations as traditionally required by the Extended KF thus usually improving the prediction accuracy (Julier and Uhlmann 1997). The full cascaded KF based estimator design is schematised in Fig. 6.

Fig. 6
figure 6

Cascaded state estimation using Linear and Unscented KF and measured MCS data. Linear KF estimates \(\varvec{x}_c\) based on the quadrotor input model, measured true pitch, roll and inputs. Unscented KF estimates \(\varvec{x}_q\) based on MAVP model, measured MAVP configuration \(\varvec{q}\) and control force inputs computed by (3) using the Linear KF outputs

5 Simulation study

We showcase our method’s scalability, robustness and performance in simulated studies. This is presented as a precursor towards our experimental results involving flight amongst multiple human obstacles and acrobatic manoeuvers.

The following metrics are used; let the system’s distance-to-goal be defined as

$$\begin{aligned} d_\text {goal} = \left| \varvec{p}_q - \varvec{p}_\text {goal}\right| , \end{aligned}$$

then the time-to-goal is the elapsed run-time such that \(d_\text {goal}\) strictly remains below 0.2 m.

Fig. 7
figure 7

Simulated NMPC solve time, time-to-goal and physical violations (collisions or breach of workspace limits) per run with increasing planning stages and 16 runs per case

Fig. 8
figure 8

Point-to-point navigation with collinear dynamic obstacle; showing planned and executed trajectory and current quadrotor position. Top down view

5.1 Scalability of the optimisation problem

The scalability of the optimisation problem is studied against the number of planning stages and separately the number of obstacles.

5.1.1 Scaling with number of planning stages

The quadrotor, with a randomised initial swing \(\theta _l\), \(\phi _l\)\(<10^\circ \) starts at \((-2.0,0.0,1.1)\) with the dynamic obstacle at (2.0, 0.0). A collinear position swap is performed with the obstacle moving at 0.5 m/s such that the head-on paths critically tests the predictive planning behaviour. The number of planning stages N is increased from 10 by 4 to 26. Using \(\varDelta t=0.05\) s, default cost weights we perform 16 runs per case.

Results in Fig. 7 show the scaling of the NMPC solve time with N; it shows a positive correlation which is expected as the optimisation variable, given by \({{\tilde{z}}}\in {\mathbb {R}}^{22N+19}\), increases the problem dimension with a larger N. As shown in Fig. 8a where N=10 the system responds late to the incoming obstacle leading to a near-miss or collision (physical violation). The late response means the attempted aggressive evasive behaviour causes the system to move far off-track, sometimes leading to workspace limit violations, and overall increasing the time-to-goal. As depicted in Fig. 8b, with a higher N the collisions are averted and the system responds in a smooth agile motion. However, with \(N=26\), the planner favours a greater MAVP to obstacle separation over each generated trajectory resulting in a lower potential field associated cost of the objective function. As a result of the greater separation, the total distance covered from the start to goal is greater thus increasing time-to-goal. Important to note is that the generated planning remains tunable through modification of the cost function weights resulting in different system behaviours.

Based on the results and qualitative observations, \(N=18\) was used for all subsequent studies as it balances run-time and planning performance. To address the pitfalls of using a low N, a novel assistive steering approach can be used that guides the planned trajectory away from obstacles even when the obstacle is not within the prediction horizon; this method is outlined in “Appendix G”. The results and benefits of assistive steering is discussed in “Appendix H”. For short prediction horizons which are low N, the assistive steering aids the planning performance for obstacle avoidance. For N higher than 14 and the system speeds achieved in our experiments, the planner’s lengthened predictive horizon makes the assistive steering redundant as no benefit are apparent. Therefore, in the experiments discussed in this paper, we do not employ assistive steering.

Fig. 9
figure 9

Simulated NMPC solve time and time-to-goal with increasing number of dynamic obstacles using \(N=18\) and 16 runs per case. No violations/collisions occurred. Outlier for time-to-goal at 6 obstacles is for a run that temporarily entered deadlock resulting in a longer path

Fig. 10
figure 10

Point-to-point navigation using N=18 amongst 8 randomised dynamic obstacles moving at \(\le 1\) m/s. The dynamically planned and agile executed trajectories of the quadrotor and payload are shown with the current quadrotor position indicated. Top down view

5.1.2 Scaling with number of dynamic obstacles

We perform a navigation task from \((-2.5,-1.0,1.0)\) to (2.5, 1.0, 1.0) amongst \(n_o\) randomly placed obstacles with randomised velocities \(\le \)1 m/s. We increase \(n_o\) from 2 by 2 to 8 with \(\varDelta t=0.05\) s, N=18, default cost weights and perform 16 runs per case.

Results in Fig. 9 indicate a positive trend in MPC solve time with \(n_o\) resulting from the additional cost and constraints introduced into the optimisation problem per additional obstacle. The time-to-goal shows an increasing spread with \(n_o\) as the obstacles are more likely to obstruct the system’s most direct path to the goal thus requiring a re-route resulting in a lengthier route. In Fig. 10 we show one run demonstrating the MAVP’s agile response amongst 8 dynamic obstacles. The outlier at six obstacles is the result of a temporary deadlock situation that is resolved by a lengthier planned route. As mentioned, NMPC is locally optimal, therefore, the deadlock situation arises from a local minimum of the objective function that occurs when several obstacles corner or obstruct the MAVP’s path. In those cases, the planning may not be able to detour around the obstruction as the objective function over the planning length may only have a positive gradient. This local optimality is a limiting characteristic of local planning algorithms (LaValle 2011). However, a benefit of our local planning method is that the system holds its position during deadlock and continues identifying solutions in the evolving solution space (due to dynamic obstacles). Therefore, in most cases it is capable of self-resolving the deadlock given sufficient time.

Table 4 Comparison of NMPC to pre-generated and minimal swing approach for mean off-line computation, trajectory execution (time-to-goal) and total task completion time over 4 repeated runs
Fig. 11
figure 11

Comparison of executed trajectories for manoeuvring around an obstacle (simple) and completing a slalom course (difficult) using NMPC with N=18, pre-generated and minimal swing planning and control. Observe that pre-generation leads to the smoothest, most optimal path resulting from the global planning scope. NMPC response resembles pre-generation and only initially reacts later to the presence of obstacles due to local planning. Minimal-swing response is sluggish as the turning motions are more suited for agile behaviour. Top down view with \(t_3>t_2>t_1>t_0\) shown

5.2 Performance comparison to contemporary approaches

We compare the total task completion time for three methods; (i) our NMPC (ii) pre-generated and (iii) minimal swing trajectory planning and control. A navigation task is performed for a simple static obstacle and a difficult slalom setup. For (i) we use \(N=18\) and default cost weights, for (ii) we use our optimiser with \(N=200\) for sufficient stages to pre-plan the entire trajectory and then simply track it, and for (iii) we use \(N=18\) with a high swing cost \(w_\text {swing}=1\). We use \(\varDelta t=0.05\) and perform 4 repeated runs. Table 4 shows a comparison of the total task completion times (off-line computation and trajectory execution), and Fig. 11 depicts the executed trajectories using the three approaches. As expected, the pre-generated trajectory has the shortest time-to-goal for both tasks due to its highly optimised planning which requires large off-line computation times. The minimal swing approach results in sharp turns as the system accelerates and decelerates at the turning points making the motion slow and space inefficient as substantial effort is required to maintain a low swing angle through the turns. The NMPC based trajectory is marginally slower and less optimal than pre-generation, however, direct deployability means the simple task is completed within 2.65 s, a \(48\%\) reduction, and the difficult task within 5.35 s, a sizeable \(65\%\) reduction compared to pre-generation. Unlike pre-generation where a task-specific trajectory is generated, our NMPC method adapts to both tasks without any reconfiguration. With increasing task complexity and duration, greater reductions can be realised making NMPC’s scalability unparalleled. Furthermore, our NMPC method applies to dynamic scenarios.

Fig. 12
figure 12

Distance-to-goal for a simple point-to-point navigation task, load suspension angles and NMPC generated pitch and roll inputs with \(N=18\) and an increased NMPC time-step from \(\varDelta t\)=0.05 s to 0.20 s and control to execution time lag of 0.10 s. In (b), observe that NMPC compensates for the long time-step over which inputs are executed by reducing the input magnitudes resulting in a stable yet slower, less agile motion. Comparing (a) and (c), observe that even with a high time lag, the MAVP responds in a stable and agile manner

5.3 Robustness to change in control time step and lags

We demonstrate the robustness of our method by (i) increasing \(\varDelta t\) from 0.05 s to 0.20 s to simulate a slower NMPC controller (on a less-powerful computer), and (ii) artificially adding a 0.1 s lag between NMPC generated input commands and actually executing them. We use the simple task from Sect. 5.2 for analysis and \(N=18\) and default cost weights. Given that N is a pre-configured design value of the MPC controller, and that \(\varDelta t\) depends on the real loop-time, this study aims to show the affects of different \(\varDelta t\) resulting from the use of hardware with varying computationally capability. Therefore, it is important to note that the lookahead horizon \(N \varDelta t\) is different in each case.

Comparing Fig. 12a, b with the different \(\varDelta t\), notice that NMPC automatically adjusts and reduces the computed input magnitudes for the longer time step resulting in a slower, less agile system; this is apparent from the distance-to-goal and load angles plots. With \(\varDelta t\)=0.20 s, agile manoeuvres are inconceivable as large inputs over the long time-step would result in excessive accelerations with detrimental consequences on overall performance. Using the process model, NMPC is able to appropriately adapt its planning and control to the time-step size to realise the desired motion.

With a 0.10 s lag, notice in Fig. 12c that the system’s distance-to-goal and load angles are similar to those with no lag in Fig. 12a. Due to our method’s closed-loop setup, the true system behaviour is continually used to re-initialise the planning instance thus modelling errors do not accumulate. If pre-generated trajectories were used, any unaccounted lag would result in significant deviations of the real system from the planned path due to model mismatch. NMPC is therefore more robust to small modelling inaccuracies making it a safer and more practical method for real-world applications.

Increasing \(\varDelta t\) further to 0.25 s and lag to 0.15 s destabilises the NMPC controller in simulation. We attribute this to several causes; first large time-steps used in combination with NMPC’s discretised process model can result in prediction error divergence. Second, unmodelled time lags result in the prolonged execution of the large magnitude inputs required for agile flight resulting in excessive, destabilising accelerations; for short lags, the closed-loop control is able to prevent this from occurring. By acknowledging the presence of a long time-step and/or lag in the controller design, the method’s prediction accuracy can be improved; this is future work.

In the simulation study we were able to select \(\varDelta t\) for a pre-configured N and show its effects on the system dynamics approximation and system performance. In the experimental setup the \(\varDelta t\) is equal to the real control loop-time which positively correlates to N as shown in Sect. 5.1. Therefore, in experimental studies it is important to tune N such that the obtained \(\varDelta t\) is compatible with the discretisation method used on the system dynamics.

6 Experimental study

We showcase complex, agile behaviour in static and dynamic experimental setups. First we analyse the effects of prediction accuracy on our planner’s performance as it is integral to our discussion of the experimental results. The same distance/time-to-goal definitions as introduced in Sect. 5 are used. Videos of the experiments performed are at https://youtu.be/9C7O34W1w8Y.

6.1 Planning prediction accuracy over prediction horizon

To demonstrate the effect of our local planner’s prediction accuracy on system performance, we replicated the simulation as described in Sect. 5.1.1 in a real-world setup. Six identical runs were performed with the NMPC planner configured to use the real time-step, \(N=18\) and default cost weights.

As our method is online and local, closed-loop trajectory planning is achieved every cycle by using the estimated system and dynamic obstacle states, and a plant model for propagating these states over N planning stages. For each generated local trajectory \(\varvec{{\tilde{x}}}\), we compute the absolute error between the predicted and future true measured quadrotor/payload position for all N stages. With \(\varDelta t\) our time-step, for the k-th stage we compare the predicted position to the true measurements obtained for the system \(k\varDelta t\) in the future.

As shown in Fig. 13, our planner’s prediction error of the quadrotor and payload position are in the order of magnitude \(10^{-2}\,\hbox {m}\) up to 3 stages. As stated in Sect. 1, each local plan must remain feasible during execution for a full computation cycle, therefore, it is important that these first stages are accurately predicted. During one computation cycle of \(\varDelta t\), the system translates to the first predicted state of our plan before we re-plan, therefore, it is critical that this prediction is accurate to guarantee feasibility during execution. We demonstrate with metrics in Fig. 13 that our planner performs as desired for this critical prediction horizon.

Fig. 13
figure 13

Absolute position error for the 18 predicted stages of quadrotor and payload position when compared to measured true values for a point-to-point navigation task with a collinear dynamic obstacle

Fig. 14
figure 14

MAVP’s agile acrobatic manoeuvre over a high bar and through a narrow opening. Snapshot of one pass shown (image). The quadrotor and payload clearance to obstacles (grey with pink enclosing ellipsoid) at the vertical obstacle plane (\(x=0\,\hbox {m}\)) over three successive passes (plot) is shown and numbered 1,2,3 and 1\('\),2\('\),3\('\) for each pass. Observe the agile motion and that sufficient clearance is maintained over all passes

Furthermore, as NMPC utilises plant model propagation over the N stage prediction we observe that the accumulation of inter-stage errors partially contributes to the reduction of prediction accuracy at the higher stage counts. Furthermore, with our local planning approach in a dynamic environment, planned trajectories can significantly differ from cycle-to-cycle as new routes become feasible at run-time. This can skew prediction error results as a predicted trajectory may never actually be executed. These two factors results in an increase of absolute position error with number of prediction stages as seen in Fig. 13. We refer to this prediction accuracy analysis in the discussion of our experimental results to support explanation of observed system behaviour.

6.2 Agile acrobatic manoeuvres

Two complex agile manoeuvres are performed; (i) the MAVP must fly over a high bar at 0.95 m with a virtual ceiling of 1.8 m, and (ii) similar to De Crousaz et al. (2014) and Tang and Kumar (2015) , the MAVP must fly through a narrow \(0.7 \times 0.7\,\hbox {m}\) opening. For both manoeuvres, three passes over/through the obstacle are performed in a rapid, successive and bidirectional manner. The tasks are impossible to execute without reducing the system’s total vertical dimension (0.9 m when stationary) by swinging the load. The NMPC uses the real time-step, \(N=18\) and default cost weights. For the narrow opening, the maximum pitch/roll input is increased to \(20^\circ \).

In Fig. 14 the two agile manoeuvres and the obstacle to MAVP clearance over all passes is shown. As the planning must excite the load’s swing over a relatively short distance, large rapid inputs are commanded. Following the manoeuvre, the controller is able to stabilise the system at the goal position. As we do all computations online, and perform the passes in rapid succession, the clearances over the three passes differ while maintaining acceptable separation to the obstacle(s). For both manoeuvres, the entire system setup is identical with only the obstacles changed exemplifying our method’s adaptability to different tasks.

Fig. 15
figure 15

One planning instance (left) and full executed trajectory (right) with \(t_3>t_2>t_1>t_0\) for a perpendicular and diagonal MAVP-human path crossing. Smooth and agile planned and executed trajectories maintain a safe separation to the moving human obstacle (shown by the bounding ellipsoid)

Fig. 16
figure 16

MAVP follows a circular path avoiding two randomly walking human obstacles modelled as ellipsoids moving at a mean 0.78 m/s with max. 1.45 m/s. As shown in (c), the 0.2 m buffered collision avoidance limit is never violated. The system only intermittently enters the larger potential field ellipsoids with a 1.0 m buffer. In (d), observe the steady loop times for the full framework (mean 71 ms) which includes the NMPC controller (mean 46 ms); brief spiking arises from situations where significant re-planning was required

Of the 48 tests performed over both tasks, 77% were successfully executed. In the rare cases when the manoeuvre was not successful, the system would either end up in a deadlock in front of the obstacle or make a momentary contact with the obstacle. Flight was recoverable following the contact with only four tests where this was not the case. For our unsuccessful tests, the primary cause was traced to our local planning approach or inaccuracies in the model resulting in the discrepancy between the observed and planned motion.

As addressed in Sect. 3.5, local planning method such as ours are prone to deadlocks. In this particular case, one reason could be an insufficient planning horizon necessary to compute a feasible trajectory over/through the obstacle. Furthermore, our NMPC optimisation algorithm utilises the time-shifted previous solution as an initial guess for the optimisation problem thus reduce computation time, however, this strategy contributes to an increased likelihood of deadlocks. The latter can be mitigated at the cost of few longer computation cycles by randomising the initial solution when deadlock situations arise. An automated and structural method to address these situations within our local planning framework is an interesting topic to be addressed in future work.

The cases involving obstacle contact were only observed in the flight through an opening experiment as the margins of error were small. Our method enables us to mitigate the effects of long horizon planning inaccuracies by using observed system states to re-plan rapidly. However, in this case, the prediction accuracy over longer horizons is critical to ensure the planner only initiates an agile manoeuvre that is feasible over the full planned trajectory. In the rare case the planner experiences an intermittent infeasibility during run-time, such as system-obstacle contact, it will quickly recover stable controlled flight once feasible trajectories become available.

The setup was extended to the case of a moving high bar manoeuvre for agile dynamic obstacle avoidance (see video).

6.3 MAVP human obstacle avoidance

Obstacle avoidance performance is demonstrated amongst dynamic human obstacles with (i) test cases involving intersecting MAVP-human paths, and (ii) random motion in a shared MAVP-human space. The humans are represented by ellipsoids with buffers \(\beta _o=0.2\,\hbox {m}\), \(\beta _e=1.0\,\hbox {m}\), and are tracked to estimate their velocities for planning. The NMPC uses the real time-step, default cost weights and \(N=18\). Note that we define the MAVP’s closest approach to the human’s associated ellipsoid as the smallest value of either the quadrotor to ellipsoid, or load to ellipsoid distance.

6.3.1 One human with crossing paths

A human walks perpendicularly and diagonally on a path crossing the MAV performing a navigation task from \((-1.5,0,1.9)\) to (1.5, 0, 1.9).

In Fig. 15 we show a snapshot and the full executed trajectory for both cases. Observe the MAVP’s smooth, safe and agile execution of the task which includes the use of full spatial avoidance exploiting the available horizontal and vertical space around the obstacle (video shows this clearly). NMPC’s predictive capability means load is actively swung away from the human’s direction of motion to avoid a potential load-human collision. The minimum MAVP to human separations for the perpendicular and diagonal crossing task were 0.45 m and 0.61 m.

6.3.2 Two humans walking randomly

Two humans walk for 150 s in random directions crossing the MAVP’s path. The MAVP autonomously follows a goal position moving anti-clockwise with a 7 s period along a circle with radius 1.5 m and a constant 1.4 m height. Experiment was conducted in a larger workspace measuring \(3.2\times 3.2\,\hbox {m}\).

In Fig. 16 we show a snapshot from our experiment alongside the system’s closest approach to the humans and the framework/NMPC loop time. As shown in Fig. 16c, a safe distance is maintained by the MAVP from the humans with no collisions over the entire run; the minimum observed separation was 0.38 m from to the human’s collision avoidance limit. In some cases the humans were apprehensive about the MAVP getting too close so they would perform a precautionary evasive motion, however, as shown in Fig. 16c, the MAVP still always maintains a safe separation. To address observable closeness, it is possible to enlarge the obstacle associated buffers to increase the separation.

Observe from Fig. 16d that the NMPC solve time resembles the statistics obtained from the simulated study with two dynamic obstacles as shown in Fig. 9, therefore, the NMPC computation performance is preserved going from a simulation to the experimental setup. As the optimiser is initialised using the time-shifted previous solution, a roughly constant solve time is achieved. Spiking occurs when the optimiser’s iterative solver requires more time to compute solutions which primarily occurs when considerable re-planning is required. Examples where we observed spikes included situations where the humans would inhibit the NMPC planner from feasibly planning a path to go to the goal position, or the MAVP would be trapped. The spikes only lasted one to two time-steps so observations showed the overall performance was not degraded. Specific to experiments is a mean 25 ms overhead (on top of NMPC solve time) associated with the framework’s state estimation, communication and data parsing. The low overhead means controller’s performance is not severely affected.

Thanks to its online and receding-horizon nature, our method can execute agile and safe continuous manoeuvres and avoid dynamic obstacles such as humans. Our method is extendable to larger spaces with more humans/obstacles as we have already demonstrated in simulation with eight obstacles.

7 Conclusion

In this paper, we presented an optimisation based unified motion planner and controller to accomplish online, closed-loop and agile flight of a Micro Aerial Vehicle slung payload system. We formulated the optimisation objective function, constraints and relied on a state of the NMPC solver to achieve collision-free flight in dynamic environments over various complex tasks including flying through a narrow opening and avoiding moving humans. With simulation and experimental studies we demonstrate the method’s (i) scalability with the planning stages and the number of obstacles, (ii) robustness to different controller time-step durations and input execution lags, (iii) adaptability and repeatability over various complex tasks, and (iv) fast online performance in experimental conditions. For future studies we recommend the method’s extension to non-rigid cables, improving the model’s realism, accuracy and consequentially the NMPC prediction performance. Furthermore, a study involving variations of the model parameters would showcase the generality of the approach to different systems and setups. Also, due to our reliance on off-board NMPC control and motion capturing we limited our experiments to indoor spaces, however, with the controller frequency achieved off-board, we believe on-board computations would be feasible with hardware available today. Combining our method with contemporary obstacle detection, localisation and state estimation techniques could make urban MAVP operation a reality.