Disturbance Observer Based Control with Anti-Windup Applied to a Small Fixed Wing UAV for Disturbance Rejection

Small Unmanned Aerial Vehicles (UAVs) are attracting increasing interest due to their favourable features: small size, low weight and low cost. These features also present different challenges in control design and aircraft operation. An accurate mathematical model is unlikely to be available, meaning optimal control methods become difficult to apply. Furthermore, their reduced weight and inertia mean they are significantly more vulnerable to environmental disturbances such as wind gusts. Larger disturbances require more control actuation, meaning small UAVs are far more susceptible to actuator saturation. Failure to account for this can lead to controller windup and subsequent performance degradation. In this work, numerical simulations are conducted comparing a baseline Linear Quadratic Regulator (LQR) controller to integral augmentation and Disturbance Observer Based Control (DOBC). An anti-windup scheme is added to the DOBC to attenuate windup effects due to actuator saturation. A range of external disturbances is applied to demonstrate performance. The simulations cover manoeuvres which would occur during landing, statistically the most dangerous flight phase, where fast disturbance rejection is critical. Validation simulations are then conducted using the commercial X-Plane simulation software. The results demonstrate that DOBC with anti-windup provides faster rejection of both modelling errors and external disturbances.


Introduction
The continual growth in the use of Unmanned Aerial Vehicles (UAVs) has generally been accompanied by a reduction in size and weight of the systems. This has led to small UAVs being widely used for a range of applications (e.g. remote sensing, mapping and traffic monitoring among many others [1]). This work is concerned with fixed-wing UAVs, which are generally operated outdoors. Operating outdoors exposes the aircraft to environmental disturbances such as wind gusts [2]. This is a challenge which has been studied since the very beginning of aviation [3] and many modern aircraft are fitted with Gust Load Alleviation (GLA) systems [4]. For large aircraft (e.g. Galaxy C-5a or Boeing 787), the GLA systems are used to reduce bending and torsional stress allowing for lighter structures and greater passenger comfort.
For small UAVs, however, wind gusts present a considerably different challenge. The significant reduction in size leads to lower inertia, making small UAVs more sensitive to disturbance [5]. Moreover, a reduction in aircraft size is generally accompanied by a reduction in operating airspeed. This has reached a critical point for small UAVs, where their operating airspeeds are of the same magnitude as the gust disturbances they are subjected to [6]. As a consequence, in gust alleviation for small UAVs, structural loads are a less critical consideration than flight performance. This presents a different problem of disturbance rejection, with small UAVs having very different considerations. Flight control can be generalised into two categories: outer loop trajectory control and inner loop attitude control. UAV trajectory tracking in wind has been studied in literature, with a range of methods applied. Vector field guidance has demonstrated robustness to wind disturbance by utilising ground speed and course for navigation [7]. It was shown that path planning with a known constant wind can improve mission accuracy and efficiency [8]. By using pre-computed information of aircraft turning performance in wind, it has also been shown that path following in wind can be improved [9]. These works demonstrate that robust methods are feasible for trajectory tracking in wind, and that using information about the wind improves performance further. The limitation is that accurate prior wind knowledge is not feasible, especially for gust disturbances. As small UAVs are highly affected by gusts, these should be considered in their operation. It has been shown that an online estimate of steady wind can be obtained and used in trajectory following, with some ability to track variance [10]. Disturbance Observer Based Control (DOBC) augmentation has shown good performance in simulation and flight testing in rejecting the disturbance of an unknown wind in trajectory tracking [11].
Inner loop control for UAVs is also widely studied in literature. However, work regarding disturbance rejection in this area is more sparse, particularly regarding external disturbances. Disturbance rejection for parameter uncertainty has been addressed by various approaches including robust methods [12], Neural Networks (NNs) [13], Support Vector Regression (SVR) [14] and Active Disturbance Rejection Control (ADRC) [15]. While these works were able to account for modelling uncertainty, they have no active consideration of external disturbance rejection. Linear Quadratic Regulator with Integral action (LQI) control has been applied to the problem [16]. This method is able to compensate for parameter uncertainties and external disturbances, through integral action on the outputs. This method is compared to DOBC in this work; as such its drawbacks are discussed in later sections where appropriate. ADRC has been applied to external disturbance rejection of a large UAV [17]. Although improvement was demonstrated, the effect of the disturbance was not completely removed from the output. DOBC has the ability to reject both internal and external disturbances, and has been applied to small UAV control [2]. Rejecting internal disturbances means modelling uncertainty, which is likely for small UAVs, is accounted for. Integral augmentation is also able to remove the effect of modelling errors in the output. However, DOBC also restores nominal performance in such cases, which is a feature not shared with integral augmentation. Integral augmentation is also able to remove the effect of external disturbances from the output; in this work we aim to further demonstrate that DOBC is faster in this regard.
In practice, every actuator has limited capabilities (e.g. a motor with limited torque, an elevator with limited deflection angle) [18]. Small UAVs are more prone to actuator saturation than their full-sized counterparts, as the magnitude of disturbances relative to the aircraft state is much higher, requiring significantly more control deflection for rejection. Under actuator saturation, control performance degradation may appear, such as large overshoot or limit cycles; in extreme cases, it is possible for instability to arise. Actuator saturation can lead to the controller windup phenomenon. With windup, internal controller states continue increasing in magnitude, which can cause overshoot and instability, among other issues. Unlike conventional approaches which consider saturation in control design [19], an anti-windup compensator [18] is employed in this work, allowing for nominal control performance recovery in the absence of actuator saturation. In this work, the classic static anti-windup [20] is exploited. This work is a continuation of the work published previously [2], aiming to further develop the technique toward the landing control problem.
In summary, small UAVs are more vulnerable to gust disturbance than full size aircraft. Furthermore, they are at greater risk of actuator saturation due to the increased relative magnitude of disturbances. Considering these issues in a situation such as landing, statistically the most dangerous flight phase [21], significantly increases the risk. The aircraft will be close to stall speed, so rapid rejection of disturbances is essential. Furthermore, not accounting for the issues brought about by actuator saturation increases the risk of aircraft loss. Small UAVs are also more likely to suffer parameter uncertainty; failure to account for this can also endanger the aircraft in critical situations. DOBC affords the ability to reject the effect of both external and internal disturbances. The anti-windup scheme proposed within removes the risk brought about by actuator saturation. This work aims to demonstrate these advantages.

Aircraft
This study is based on the Skywalker X8 shown in Fig. 1. The X8 is a popular commercially available airframe in the small UAV category, which has been used in a number of research papers [22][23][24]. The aircraft has no direct yaw control surfaces, whilst pitch and roll are controlled by elevons.

System Dynamics
The coordinate system for the model is shown in Fig. 2, where $x_B$ and $z_B$ represent the body axes and $x_E$ and $z_E$ represent the global axes. Moreover, $u$ and $w$ represent the body horizontal and vertical velocities, $V$ is the total airspeed, $\theta$ is the flight path angle, $M$ is the pitching moment, $\delta$ is the control surface deflection and $\delta_T$ represents the throttle setting position. The thrust force is presumed to act directly along the positive $x_B$ axis. The directionality of $\delta$ is such that a positive deflection … This system is condensed into a state-space model of the form given in Eq. 1. Here, $X$, $Z$ and $M$ are the non-dimensional stability coefficients due to their associated subscripts, $q$ is the pitching rate, $g$ is the gravitational acceleration constant, $x$ is the system state, $u_\delta$ is the control input vector and $A$ and $B$ are the system matrices. Any state denoted further with an * represents the state at the linearisation point of the model.
As $u_\delta \in \mathbb{R}^{2\times1}$, only two reference commands can be accurately followed. For this work, based on the motivations discussed in the introduction, the states which are to be controlled are $u$ and $h$. The reference command $r$ (Eq. 2) and the output equation for $y$ are defined accordingly. In this work, the $A$ and $B$ system matrices for the state space model were obtained from system identification applied to a model of the X8 in X-Plane. X-Plane is recognised in the industry as a good simulation environment [25] providing accurate data. X-Plane simulation is conducted based on a geometrical model of the X8, using blade element theory to determine the aerodynamic performance. Using flight data from this model, state space system identification methods were applied to obtain a linear model of the X-Plane X8. This approach is representative of real world applications of the technique in this paper, as a linear model has been obtained from a non-linear flying platform. Modelling errors are therefore expected to be present, and the DOBC is expected to deal with them.

Control and Performance Objectives
The control objectives for the system are to provide state regulation around the trim condition, while also allowing for tracking of a desired change in reference command. Furthermore, the controllers should be robust against disturbances, allowing for the control objectives to still be accomplished accurately. The performance objectives are defined to satisfy these control objectives.

Tracks u and h Around a Given Reference Command
This objective is concerned with the controllers being able to provide sufficient regulation about a reference command, as well as good tracking performance to a changing reference input. The initial reference command will be to maintain the trim condition for the aircraft, which is V = 15 m/s and h = 300 m. This objective is mainly concerned with the rejection of internal modelling errors. That is to say, the ability of the linear controllers to control a non-linear plant.

Maintain Safe Flight Under External Disturbances
This objective links with the motivation of the work; the ability of the controller to reject external disturbances and maintain a safe flight condition. This means providing rapid disturbance rejection to avoid aircraft stall or excessive loss of height; the two conditions which are most dangerous in the landing scenario.

Nominal Feedback Control Design
Linear Quadratic (LQ) control was selected for the control strategy, as it represents a well understood linear optimal control strategy. Three control schemes will be compared. A Linear Quadratic Regulator (LQR) will be used as a baseline for comparison to further controller augmentations. A reference tracking Linear Quadratic Regulator with Integral action (LQI) represents classical augmentation for robustness. Finally, an LQR with DOBC augmentation will be used. To ensure comparability, the LQI will be designed first; the baseline LQR is extracted from the designed LQI. DOBC augmentation is applied to the extracted LQR. This ensures that the state regulation provided by the LQR in each case will provide comparable performance, which allows for accurate study of the disturbance rejection of classic LQI augmentation when compared to DOBC augmentation.

Linear Quadratic Integral Control Design
Integral augmentation is the classical method for disturbance compensation. The LQI used in this work is based on previous research [26], and as such the full derivation is omitted for brevity; a brief overview is presented. Firstly, a new state $e_s$ is defined to represent the tracking error with respect to the reference command given in Eq. 2. Secondly, two new states are defined as the integral of this reference error. The original system is then expanded to include the new states, and the system error is defined on the expanded system (Eq. 4). Next, the general LQ cost function is modified by inserting the error system in Eq. 4; this is given in Eq. 5, where $Q_i$ and $R$ are the state and control weighting matrices, respectively. By modifying the system to include error dynamics, the state regulation gains are chosen internally by the system once the error weighting $Q_i$ has been selected. Following the standard method, details of which can be found in [26], the Algebraic Riccati Equation (ARE) can be solved for the control law (6). The gain $K_{xi}$ in this control law contains both the state feedback gain $k_x \in \mathbb{R}^{2\times5}$ and the error integral gain $k_i \in \mathbb{R}^{2\times2}$, in the form $K_{xi} = [k_x \;\; k_i]$. The controller layout is shown in Fig. 3.
Fig. 3 The diagram of Linear Quadratic Integral controller
The resulting optimal gains depend only on the system being modelled and the weighting matrices $Q_i$ and $R$. The introduced integral action will account for modelling errors and external disturbances, although it does introduce issues. Integral control is a lagging controller in the sense that an error must first exist for a period of time for the controller to generate feedback to eliminate it. Although this can be made to happen quickly with an increased integral gain, this also introduces other problems such as overshoot, oscillation and control surface saturation; in reality this means that a high integral gain is not a feasible solution [27]. The DOBC method proposed in this work aims to eliminate these errors and improve performance.
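The trade-off between a slow, well-damped integrator and a fast, oscillatory one can be illustrated with a minimal scalar sketch. The plant `x' = -x + u + d`, the gains and the unit step disturbance below are hypothetical values chosen purely for illustration; they are not the X8 model or the LQI gains used in this work.

```python
def simulate_integral(ki, k=2.0, d=1.0, dt=1e-3, t_end=10.0):
    """Forward-Euler simulation of the hypothetical plant x' = -x + u + d
    under state feedback with integral action, u = -k*x - ki*integral(x).
    The reference is zero and a constant disturbance d acts from t = 0.
    Returns the list of x samples."""
    x, xi = 0.0, 0.0  # plant state and integral of the error
    xs = []
    for _ in range(int(t_end / dt)):
        u = -k * x - ki * xi
        x_dot = -x + u + d
        xi += dt * x   # integrate the error (uses the old x)
        x += dt * x_dot
        xs.append(x)
    return xs


low = simulate_integral(ki=2.0)    # overdamped closed loop
high = simulate_integral(ki=50.0)  # lightly damped closed loop
print("low gain : peak %.3f, undershoot %.3f" % (max(low), min(low)))
print("high gain: peak %.3f, undershoot %.3f" % (max(high), min(high)))
```

In this sketch the higher integral gain reduces the peak deviation caused by the disturbance, but the response then swings through zero and oscillates, whereas the lower gain returns slowly without changing sign.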

Linear Quadratic Regulator Control Design
This section details the development of an LQR controller with reference tracking. Any matrix associated exclusively with the LQR will be denoted by a subscript $r$. For system (1), one can design an LQR for state regulation using the approach discussed in Section 3.1; this involves minimising a quadratic cost function in which the selection of $Q_r$ would allow for tuning the weights placed on individual states by the optimisation routine. However, for a better comparison between controllers in this work, we continue in this section by applying the state regulation gain $k_x$ from $K_{xi}$ in Eq. 6. The control law for this regulator is then given by Eq. 7. To include reference tracking, a feedforward gain $N$ is defined as part of the control law; it is chosen so that the DC gain of the transfer function from reference command to system output is 1. To determine $N$, the transfer function $G_{ry}$ (from reference to output) of the closed loop system (1) under control (7) is calculated. For a reference with a steady state value, the gain matrix $N$ follows from letting $s \to 0$ and requiring $G_{ry}$ to be an identity matrix. The controller layout is given in Fig. 4. This figure highlights one of the main issues with this technique: there is no direct feedback on the reference error. This causes two problems. First, the technique relies on $A$, $B$ and $C$ being known exactly; any modelling error will result in steady state error. Second, the lack of reference feedback limits the degree to which external disturbances can be rejected. In the following section, disturbance observer augmentation will be explored, which provides a parallel approach to LQI while enabling fast disturbance rejection.
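As a hedged illustration of the feedforward gain, assume the control law has the form $u_\delta = -k_x x + N r$ (the sign convention is an assumption). Setting $s \to 0$ in $G_{ry}(s) = C(sI - A + Bk_x)^{-1}BN$ and requiring identity then gives $N = -[C(A - Bk_x)^{-1}B]^{-1}$. The scalar sketch below uses hypothetical plant values and shows both the exact-model case and the steady state error caused by a mismatch in $A$.

```python
def dc_gain(a, b, c, k):
    """Feedforward gain N giving unit DC gain from r to y for the scalar
    system x' = a*x + b*u, y = c*x under u = -k*x + N*r.
    Scalar version of N = -[C (A - B k)^-1 B]^-1."""
    return -(a - b * k) / (c * b)


def steady_state_output(a_true, b_true, c, k, n, r):
    """Steady-state output of the true plant under u = -k*x + n*r,
    found by setting x' = 0."""
    x_ss = -b_true * n * r / (a_true - b_true * k)
    return c * x_ss


# Hypothetical design model and gain (not the X8 values).
a, b, c, k = -1.0, 2.0, 1.0, 1.5
n = dc_gain(a, b, c, k)

print("N =", n)
print("exact model   :", steady_state_output(a, b, c, k, n, r=1.0))
print("a off by 0.5  :", steady_state_output(-0.5, b, c, k, n, r=1.0))
```

With the exact model the output settles on the reference; with the perturbed plant it settles at 8/7 of the reference, which is the kind of steady state error that the lack of reference feedback cannot remove.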

Disturbance Observer Augmentation
Adding integral action to an LQR controller allows for disturbance rejection through feedback regulation; this approach can remove the effect of nearly constant disturbances in steady state, but comes at the price of nominal performance degradation, such as overshoot and control saturation during the transient [27]. DOBC has recently received much attention in both academia and industry [28] due to its promising features, such as the preservation of the nominal control performance and the "separation principle" for the ease of control design [28,29]. DOBC is usually patched into an existing baseline controller, with the baseline controller addressing stability and performance specifications. The DOBC augmentation is used to reject disturbances and return the baseline controller to nominal performance. In this paper, DOBC is exploited to achieve external disturbance rejection control for a small UAV, where the LQR discussed in Section 3.2 serves as the baseline optimal control for nominal performance. DOBC will also be used to eliminate the effect of model uncertainty from the system. The detailed design procedure of the DOBC is given as follows.

Observer Design
A key feature of DOBC is the ability to account for modelling uncertainties as well as external disturbances. First, we add the external disturbances to the system, which transforms (1) to (8).
where $A_t$ and $B_t$ are the true matrices which would ideally describe the system being modelled, $B_d = I_{5\times5}$ as the disturbances are presumed to act directly on the states, and $d_x$ is the vector of external disturbances. No disturbance is considered on $\theta$ as it is a known kinematic relation to $q$. In our case we define $d_{lx}$, the lumped disturbance term, as the sum of the external disturbances and the modelling errors. The lumped disturbances are defined to act in the same channels as the external disturbances, so $B_{ld} = B_d$. This demonstrates how modelling errors are accounted for, as the difference between the true, unknown matrices ($A_t$, $B_t$) and the identified matrices ($A$, $B$) used for control design. Modelling errors are presumed to be present as it is prohibitively difficult and expensive, if possible at all, to obtain the true system matrices for a small UAV. Furthermore, even the ideal system would become inaccurate as the aircraft departs from the linearisation point, which is expected during operation. Using DOBC accounts for these issues. With the disturbances defined, it remains to design an observer which can estimate them. Such an observer is given in Eq. 10 [30].
where $z$ represents the observer internal state, $L$ is a gain matrix which can be tuned for performance and $\hat{d}_{lx}$ is the estimate of the disturbances $d_{lx}$. To demonstrate the ability of the observer estimate to track the disturbances, we first define an estimation error $e_d$ (11) between the estimate and the true disturbances. Substituting $\hat{d}_{lx}$ from Eq. 10 and then expanding the $\dot{z}$ term further from Eq. 10, the expansion of $\dot{e}_d$ can be reached.
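For reference, the steps described above can be sketched as follows. The observer form is an assumption: it is the form commonly used in the DOBC literature, written with the symbols defined in the text, and it may differ in detail from Eq. 10 as printed. The lumped disturbance expression uses $B_d = I$.

```latex
\begin{aligned}
\dot{x} &= A x + B u_\delta + B_d d_{lx}, \qquad
  d_{lx} = d_x + (A_t - A)\,x + (B_t - B)\,u_\delta \\
\dot{z} &= -L B_d \left( z + L x \right) - L \left( A x + B u_\delta \right), \qquad
  \hat{d}_{lx} = z + L x \\
e_d &= \hat{d}_{lx} - d_{lx} \\
\dot{e}_d &= \dot{z} + L \dot{x} - \dot{d}_{lx} \\
  &= -L B_d \hat{d}_{lx} - L \left( A x + B u_\delta \right)
     + L \left( A x + B u_\delta + B_d d_{lx} \right) - \dot{d}_{lx} \\
  &= -L B_d \, e_d - \dot{d}_{lx}
\end{aligned}
```

Under this assumed form, the model terms cancel and the estimation error is driven only by the rate of change of the lumped disturbance.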
Under the assumption that the external disturbances vary slowly, we can set $\dot{d}_{lx} \approx 0$. This has two results. Firstly, it is demonstrated that the error dynamics are stable if $-LB_d$ is Hurwitz. Secondly, the effect of $L$ can be seen more clearly: assuming $L$ is chosen such that $-L$ is Hurwitz, and recalling that $B_d = I_{5\times5}$, a larger value of $L$ will result in more rapid convergence of the estimation error. Although the error dynamics are proven stable for a steady disturbance, it has been demonstrated [31] that the observer can track time varying disturbances as long as the observer dynamics are faster than those of the disturbance. This completes the observer design process. It remains to design a strategy to generate appropriate control inputs to mitigate the effect of these disturbances on the system.
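The influence of the observer gain on a time varying disturbance can be illustrated with a scalar sketch. It assumes the commonly used observer form $\dot{z} = -L(z + Lx) - L(ax + bu)$, $\hat{d} = z + Lx$ with $B_d = 1$; the plant values, the sinusoidal disturbance and the two gains are hypothetical.

```python
import math


def max_estimation_error(L, omega=1.0, a=-1.0, b=1.0, dt=1e-3, t_end=20.0):
    """Simulate x' = a*x + b*u + d with d = sin(omega*t) and a scalar
    disturbance observer of gain L (forward Euler). Returns the largest
    |d_hat - d| seen over the second half of the run, i.e. after the
    initial transient has decayed."""
    x, z = 0.0, 0.0
    worst = 0.0
    for i in range(int(t_end / dt)):
        t = i * dt
        d = math.sin(omega * t)
        u = 0.0  # open loop: only the estimation quality is studied here
        d_hat = z + L * x
        if t > t_end / 2:
            worst = max(worst, abs(d_hat - d))
        x_dot = a * x + b * u + d
        z_dot = -L * (z + L * x) - L * (a * x + b * u)
        x += dt * x_dot
        z += dt * z_dot
    return worst


print("L = 2 :", max_estimation_error(2.0))
print("L = 50:", max_estimation_error(50.0))
```

For this scalar case the error dynamics are $\dot{e}_d = -L e_d - \dot{d}$, so the steady error amplitude is approximately $\omega/\sqrt{L^2 + \omega^2}$; the faster observer therefore tracks the sinusoid far more closely than the slower one.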

Disturbance Compensation Gain
The disturbance observer employed in this work is of the "mismatched" form [30]. This condition is defined by the mismatch in dimensionality between $u_\delta \in \mathbb{R}^{2\times1}$ and $d_{lx} \in \mathbb{R}^{5\times1}$ in Eq. 8. More colloquially, this condition results from not having a direct control input to address each disturbance channel. As a result, it is not possible to remove the disturbances from all the states. It is, however, possible to remove the disturbances from the output channels. Based on the disturbance estimate given by Eq. 10 and the control law (7), a composite control law is produced, in which $k_{dx}$ is the compensation gain to be designed. As shown in [30], $k_{dx}$ can be found from Eq. 13. It can be seen that the compensation gain does not require tuning independently from the closed loop system under LQR regulation. With the appropriate gains calculated, the disturbance observer can now be integrated into the system architecture. The complete LQR + DOBC system is shown in Fig. 5.
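A small mismatched example may help make this concrete. The sketch below assumes the composite law $u = -k_x x + N r + k_{dx}\hat{d}$ with $k_{dx} = -[C(A - Bk_x)^{-1}B]^{-1} C (A - Bk_x)^{-1} B_d$, which is the form usually given in the DOBC literature under this sign convention; whether it matches Eq. 13 term for term is an assumption. The two-state plant, gains and disturbances are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def compensation_gains(A, B, C, k):
    """Return (N, k_dx) for a 2-state, single-input, single-output system
    with B_d = I, under the assumed law u = -k x + N r + k_dx d_hat."""
    Ac = [[A[i][j] - B[i] * k[j] for j in range(2)] for i in range(2)]
    Ac_inv = inv2(Ac)
    row = [C[0] * Ac_inv[0][j] + C[1] * Ac_inv[1][j] for j in range(2)]  # C Ac^-1
    g = row[0] * B[0] + row[1] * B[1]                                    # C Ac^-1 B
    return -1.0 / g, [-row[j] / g for j in range(2)]


def simulate(d=(0.5, -1.0), r=1.0, L=20.0, dt=1e-3, t_end=10.0, use_dobc=True):
    """Forward-Euler run of x' = A x + B u + d, y = x[0], with constant d.
    Only x[1] has a direct control input, so d[0] is mismatched."""
    A = [[0.0, 1.0], [-2.0, -3.0]]
    B = [0.0, 1.0]
    C = [1.0, 0.0]
    k = [4.0, 2.0]
    N, k_dx = compensation_gains(A, B, C, k)
    x, z = [0.0, 0.0], [0.0, 0.0]
    d_hat = [0.0, 0.0]
    for _ in range(int(t_end / dt)):
        d_hat = [z[i] + L * x[i] for i in range(2)]
        u = -k[0] * x[0] - k[1] * x[1] + N * r
        if use_dobc:
            u += k_dx[0] * d_hat[0] + k_dx[1] * d_hat[1]
        model = [A[i][0] * x[0] + A[i][1] * x[1] + B[i] * u for i in range(2)]
        z_dot = [-L * d_hat[i] - L * model[i] for i in range(2)]
        x = [x[i] + dt * (model[i] + d[i]) for i in range(2)]
        z = [z[i] + dt * z_dot[i] for i in range(2)]
    return x, d_hat


print("with DOBC   :", simulate(use_dobc=True)[0])
print("without DOBC:", simulate(use_dobc=False)[0])
```

In this example the output state returns to the reference when the compensation is active, while the second state settles at a non-zero value: the disturbance is removed from the output channel but not from every state, as described above.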

Anti-Wind Up Modification
In this section, anti-windup compensation is considered for the proposed DOBC, using the classic static anti-windup structure [20]. This approach has received much attention in practice [18] due to its desirable properties such as nominal control performance recovery. Consider system (1) with a saturated actuator and disturbances, denoted system (14). A classic static anti-windup modification [20] for system (14) is made to the observer (10), giving the modified observer (15), where $k_a S_u$ is a term introduced to reduce the effect of input saturation, with $k_a$ being the anti-windup gain.

Remark 1 In this paper, the classic static anti-windup compensator is augmented to DOBC; the stability analysis of the modified DOBC under input saturation is omitted for brevity. However, it can be proved that under the modified DOBC, the closed-loop system is asymptotically stable with a basin of attraction using the result in [20].

Remark 2 Under the anti-windup gain matrix in Eq. 17, the modified DOB (15) can be put into an equivalent form (18). Although the anti-windup compensator is designed using the classic result [20], comparing the conventional DOB (10) and the modified DOB (18) shows that the modified DOB ends up directly using the saturated control input $\mathrm{sat}(u_\delta)$ rather than the calculated input $u_\delta$, which substantially eases the implementation.
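The practical consequence of Remark 2 can be sketched with a scalar example in which the only difference between the two observers is whether they are fed the calculated input or the saturated input. The observer form, plant, gains, saturation limit and disturbance profile are all hypothetical and chosen so that the disturbance exceeds the actuator authority for five seconds.

```python
def sat(u, u_max=1.0):
    """Symmetric actuator saturation."""
    return max(-u_max, min(u_max, u))


def simulate(anti_windup, a=-1.0, b=1.0, k=2.0, L=10.0, dt=1e-3, t_end=20.0):
    """Forward-Euler run of x' = a*x + b*sat(u) + d with the composite law
    u = -k*x - d_hat/b (reference zero). If anti_windup is True the observer
    uses sat(u); otherwise it uses the calculated input u.
    Returns (x samples, d_hat samples)."""
    x, z = 0.0, 0.0
    xs, d_hats = [], []
    for i in range(int(t_end / dt)):
        t = i * dt
        d = 3.0 if 1.0 <= t < 6.0 else 0.0  # larger than the actuator can reject
        d_hat = z + L * x
        u = -k * x - d_hat / b
        u_applied = sat(u)
        u_obs = u_applied if anti_windup else u
        x_dot = a * x + b * u_applied + d
        z_dot = -L * (z + L * x) - L * (a * x + b * u_obs)
        x += dt * x_dot
        z += dt * z_dot
        xs.append(x)
        d_hats.append(d_hat)
    return xs, d_hats


xs_aw, dh_aw = simulate(anti_windup=True)
xs_no, dh_no = simulate(anti_windup=False)
print("largest estimate, saturated input :", max(dh_aw))
print("largest estimate, calculated input:", max(dh_no))
print("lowest x after release, saturated input :", min(xs_aw))
print("lowest x after release, calculated input:", min(xs_no))
```

In this sketch the observer fed with the calculated input keeps integrating the unachievable control demand, so its estimate grows far beyond the true disturbance and the state is driven well past the reference once the disturbance is removed. The observer fed with the saturated input keeps an accurate estimate throughout and shows no such overshoot.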

Simulations
Results from numerical state space simulations are presented first to compare the characteristics of the different controllers in Section 5.2. For full comparison, both the anti-windup augmented and original LQR + DOBC controllers are included. Beginning with numerical simulation allows for study of performance in the presence of external disturbances only.
Then, in Section 5.3, X-Plane simulations are conducted. These aim to represent real world application of the systems, as no exact model of the aircraft is available; an approximation was obtained through system identification. Furthermore, the simulation is entirely non-linear. This represents a difficult case for the DOBC, as both internal and external disturbances will be present, as well as unmodelled actuator dynamics.

Controller Tuning and Gain Selection
Some preliminary work was done to identify a good baseline LQR/LQI controller; the corresponding simulations are not included here for brevity. As the LQR controller is based on the LQI, there are only two tunable parameters: $Q_i$ and $R$, as given in (19). More weighting was applied to the error in $u$, as there is a much smaller tolerance on the magnitude of this error.

Numerical Simulations
Conducting initial numerical simulation allows for study of controller performance with total control over the disturbances. No internal modelling disturbances will be present as the simulation model is identical to the model used for control design. External disturbances will only be present when explicitly enabled.

Step Disturbance Performance
In this section, disturbance rejection for a range of disturbances is demonstrated while maintaining the reference condition. The applied disturbances are summarised in Table 1. For the first 2 disturbances ($w_d$ and $q_d$) the DOBC schemes show significantly improved performance. As no saturation is present, the anti-windup scheme does not activate. The $h_d$ disturbance was chosen to be of a magnitude greater than the system is able to reject, to bring about control saturation. This highlights the performance of the anti-windup scheme well, as the unmodified DOBC suffers overshoot of both reference commands due to the windup once the disturbance is removed; the anti-windup modified observer does not suffer any such issue and returns to the reference command easily. In all 3 cases, the anti-windup observer controller has significantly improved disturbance rejection over the LQI scheme (Figs. 6, 7, 8, and 9).

Height Reference Tracking with Disturbance
A step change in reference height is now commanded; the step input is passed through a shaping filter to smooth the command. A disturbance $q_d = -22$ is added to the simulation at t = 20 s. The results are given in Figs. 10, 11, 12, and 13. This disturbance is chosen to represent a wind disturbance during a landing scenario. As the aircraft approaches ground level, the disturbance is applied. It can be seen that all controllers except the anti-windup DOBC undershoot both the $u$ and $h$ references by significant amounts. The unmodified DOBC offers some initial disturbance rejection but quickly degrades to performance worse than the LQI; this highlights the significant and important improvement offered by the anti-windup modification. If this were a landing attempt, only the LQR + DOBC (aw) would have successfully completed the manoeuvre with no undershoot or steady state error.
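The structure of the shaping filter is not stated in the text. As a minimal sketch, assuming a first-order low-pass filter with a hypothetical time constant and a hypothetical target height, the smoothed height command could be generated as follows.

```python
def shape_step(r_start, r_end, tau=2.0, dt=0.01, t_end=20.0):
    """Pass a step from r_start to r_end through the first-order filter
    r_f' = (r_end - r_f) / tau (forward Euler). The filter order and the
    time constant tau are assumptions made for illustration."""
    r_f = r_start
    shaped = []
    for _ in range(int(t_end / dt)):
        r_f += dt * (r_end - r_f) / tau
        shaped.append(r_f)
    return shaped


# Step from the 300 m trim height to a hypothetical lower height.
heights = shape_step(300.0, 250.0)
print("first sample:", heights[0], " last sample:", heights[-1])
```

The shaped command approaches the new height monotonically without the discontinuity of a raw step, which avoids demanding an instantaneous change from the controllers.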

Varying Disturbance Rejection
The stability property for the DOBC controller given in Section 4.1 assumed a steady disturbance. The previous simulations have shown good performance under these conditions. However, it has been demonstrated in literature [31] … anti-windup action for the DOBC. It can be seen that the DOBC offers significantly improved disturbance rejection for the varying disturbance over the other controllers, whilst not suffering any overshoot or windup issues.

X-Plane Simulation
To verify the results obtained in the state-space simulations, the controllers were further tested using X-Plane flight simulation software, which is recognised for good simulation fidelity [25]. Figure 18 shows the system used to obtain data. For maximum X-Plane and Simulink fidelity, the two environments were run on separate computers. A UDP data link was used to stream the aircraft state and control commands between the systems. The communication network delay was measured as less than 1 ms, meaning no discernible delay was introduced using this method. The intent is to demonstrate that the DOBC can still offer performance improvement in this environment. This is representative of real world application of the system, as the linear model, obtained by system identification, is applied to a non-linear simulation. For these simulations, $R$ was increased to reduce the total gains for the controllers. This was required to maintain stability in the simulation. Furthermore, $L$ was decreased to maintain stability. The penalties were applied equally to all controllers to retain comparability. In these simulations, 3 separate disturbances were applied as detailed in Table 2. Again, the anti-windup DOBC offers significant performance improvements over the other controllers. The unmodified DOBC suffers from significant undershoot and overshoot of the reference following a period of control saturation. As shown in the numerical simulations, only the anti-windup DOBC is able to reject all the disturbances and reach the desired landing reference.
Fig. 18 Simulation environment used for X-Plane data collection. Separate computers allowed X-Plane and Simulink to run at their ideal rates. Network delay was below 1 ms

Conclusions and Future Work
This work has shown that both integral and disturbance observer augmentation of a baseline LQR can improve disturbance rejection performance. However, it has been demonstrated clearly that the disturbance observer augmentation provides faster and more accurate disturbance rejection. In situations where control saturation occurs, DOBC performance can become degraded, in some cases performing worse than the LQI. By augmenting the disturbance observer with an anti-windup scheme based on classical static anti-windup compensation, the adverse effect of controller windup due to actuator saturation was removed from the system and nominal performance restored. In situations with no actuator saturation, the proposed anti-windup scheme had no effect on the performance of the DOBC, which is a favourable result. It was also demonstrated that the linear disturbance observer was able to replicate this performance when applied to a non-linear simulation under adverse conditions. The non-linear simulations demonstrated that the designed linear observer resulted in better disturbance rejection than the integral augmentation in 3 key situations: steady state at the trim condition, during a changing reference, and in steady state at a condition well away from the linearisation point. This demonstrates that a basic linear model is suitable for designing a baseline LQR controller which, when augmented with a disturbance observer, can be applied to a non-linear plant with superior performance to the traditional integral augmentation.
The results presented here have demonstrated that classical integral augmentation does not provide disturbance rejection which can match the disturbance observer for response rate and therefore flight safety. This is a promising result for future application to a landing scenario to improve flight safety. Furthermore, it has been shown that the employed anti-windup scheme is able to remove the risk of controller windup due to actuator saturation which resulted from the unmodified disturbance observer.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.