AL-TUNE: A Family of Methods to Effectively Tune UAV Controllers in In-flight Conditions

In this paper, a family of novel real-time methods is presented for tuning an unmanned aerial vehicle (UAV) altitude controller in in-flight conditions. The methods adapt the controller's gains solely on the basis of measurements from basic sensory equipment, constructing the optimization cost function on-line with virtually no additional computational burden; for the hexacopter altitude controller considered in this paper, only altitude measurements were used. The methods do not depend on the measurement level, and the approach is presented in a generally applicable form for tuning arbitrary controllers with a low number of parameters. Real-world experimental flights, preceded by simulation tests, have shown which method behaves best in a noisy environment, e.g. when wind disturbances act on a UAV in autonomous flight. As the methods can potentially be extended to other control loops or controller types, they constitute a versatile, rapid-tuning tool. It has been shown that a controller tuned with the proposed AL-TUNE scheme outperforms controllers that are tuned merely to stabilize the system. AL-TUNE opens a new way of using UAVs whose dynamic properties change, and can be deployed rapidly. This enables new applications and extends the usability of fully autonomous UAVs, unlike other tuning methods, which essentially require the availability of a UAV model. The core difference with respect to other research in the field is that other authors either use a model of the UAV to optimize the gains analytically or use machine learning techniques, which increases time consumption, whereas the presented methods tune controllers rapidly and reliably, with deterministic time requirements.


Motivation
Precise tuning of a controller determines the in-flight properties of any UAV architecture. High-quality tracking usually requires nonlinear controllers that need information about the dynamics of the UAV; a model of the UAV is therefore needed. This usually involves the use of considerable on-board computational power, unlike the approach proposed here. As an alternative, fixed-parameter PD-type or PID-type controllers can be used (see [26,33]). However, these require proper tuning.
Well-tuned fixed-parameter controller gains result in high performance in UAV applications where there is no need to perform any real-time adaptation. However, in aerial manipulation applications [6,41], such as gripping elements and lifting them from the ground (see Fig. 1), the dynamics of the UAV change, and the controller parameters need to be altered. In many UAV missions, precise altitude controller action is of prime importance, e.g. during the take-off and landing phases [10], and for a smooth touchdown phase. While grasping objects [25,35,39], the UAV is influenced by interference from the ground effect, and often by wind gusts. Manual tuning of one set of controller parameters takes dozens of trials and days of outdoor flying in a range of environmental conditions (see examples of system deployment in various environments and flight conditions in Fig. 1). In addition, grasping objects requires a new set of controller gains for various configurations of the gripping arms during transport tasks [41]. This is an even more significant problem for multi-UAV systems, where, in principle, each vehicle has to be tuned separately.

Fig. 1 Examples of the deployment of our system of autonomous UAVs and teams of UAVs in pick-and-place and car-landing applications in various environments that motivated the research presented in this paper: a) object grasping in challenging desert conditions, b) precise landing on a moving car, c) object grasping in a green field, d) UAVs cooperatively collecting objects in a desert environment https://youtu.be/ns8n8xEMxD8, e) successful deployment of the system in the MBZIRC 2017 competition http://mrs.felk.cvut.cz/mbzirc, https://youtu.be/ogmQSjkqqp0, f) landing on a moving car during the competition with fine-tuned controller gains https://youtu.be/zHs5LtLyBsc
This in fact inhibits the use of large teams of autonomous UAVs in tasks where interaction with the environment is required, for example in the pick-and-place challenge and in cooperative construction assembly. In fire fighting, the UAV model changes due to the release of the extinguishing medium [28]. These requirements have motivated the authors to introduce a family of methods for controller tuning. The aim is to achieve superior results (within a prescribed tolerance) much more quickly than with traditional approaches. This is a major contribution of our work, and it could significantly extend the application domain of UAVs. The proposed methods differ in the algorithm for picking intermediate search points, but can all be regarded as different instances of the same general procedure, which is the main reason for presenting the family of methods in a single paper.

Related work
Today, the applications of UAVs [7] range from routing, data gathering, cellular communications and Internet of Things (IoT) networks to disaster management, extending the capabilities of other techniques. All these cases require proper controller tuning to ensure good performance. Many model-based and model-free approaches to controller tuning have been developed. A review of 51 methods applicable to PIDs can be found in [31], and a range of tuning methods for fixed-parameter controllers is presented in [29].
As computational power has increased in recent years, a tendency has been observed to use meta-heuristics for the controller tuning problem. In the context of UAVs, bio-inspired approaches have attracted major attention. Since the beginning of the 21st century, at least 30 important new bio-inspired optimization methods have been developed. Several of these methods, alongside classical algorithms such as Particle Swarm Optimization (PSO), Ant Colony Optimization, the Bat-inspired algorithm, the Artificial Bee Colony, etc., have been characterized, e.g. in [14] and [16], from the viewpoint of application in UAVs. Paper [14] presents results on PSO for a controller used in an unmanned rotorcraft pendulum, on multiple-UAV formation control, and on multiple UAV/UGV heterogeneous coordinated control with the use of bio-inspired techniques.
Recent papers on tuning UAV controllers fall into three main categories: fuzzy logic applications to maintain self-tuning capabilities, adaptive gain scheduling approaches, and classical adaptive pole-placement techniques. References [1,15,36] present the adoption of fuzzy logic rules for tuning controller parameters. A basic disadvantage is that expert knowledge is needed for constructing the inference rules, preferably on the basis of a model. References [15,32] adopt the gain scheduling approach, where a programme of gain changes is derived from models for multiple operating conditions. Finally, in [12,42,44,45], a model is used to determine appropriate controller gains in order to obtain nominal performance. As can be seen, all of these approaches inherently need a model. Another division can also be made: the model is used to tune gains on the basis of analytical calculations, or using other criteria, or by means of machine learning techniques [38] with artificial neural networks. Such an approach is greatly time-consuming, whereas our solution offers a rapid and reliable way of tuning controllers, with deterministic time requirements related to the tolerance.
As stated in [17], which builds on the Cuttlefish Optimization Algorithm, new bio-inspired algorithms also have the potential to work as batch methods for tuning an altitude controller based on a UAV dynamical model. However, a model of the UAV is still required. In addition, all the methods mentioned above remain inapplicable in real-time conditions or on an on-board UAV computer. On-board embedded systems are still too slow to allow successful in-flight optimization.
A short characterization and grouping of the approaches in the field is given in Table 1.

Contribution and organization of the paper
In view of the drawbacks of the methods outlined above, our investigation concentrated on developing a simple and applicable method for tuning or fine-tuning the gains of standard fixed-parameter controllers, e.g. PD or PID, for the UAV community, and on presenting it for precisely tuning an altitude controller. Fixed-parameter controllers are widely used in commercial and open-source flight controllers (e.g. Pixhawk, Naze32, CC3D Open Pilot). However, due to the wide range of UAV applications, it is important to perform tuning in flight conditions, without needing to model the dynamics of the drone, using an iterative approach. To the best of the authors' knowledge, the only currently available commercial solution is PID autotuning for the roll, pitch and yaw angles (CleanFlight's G-Tune) for Naze/Multiwii platforms. Literature searches have not revealed any results for iterative autotuning of altitude controllers that can be performed during flight; where iterative learning is used, it is based on knowledge of a model, as in [37] or in [13].
Yet other tuning techniques include hybridization approaches, e.g. [3], where a fuzzy logic approach is combined with MRAC MIT-rule-oriented tuning, which is inherently limited by the MIT rule and its local convergence, or approaches that build knowledge about the UAV on the basis of vision-gained information [2] to improve the performance of the system.
The AL-TUNE scheme presented in this paper performs automated tuning of controllers, using single-variable deterministic optimization methods for a given tolerance of the solution and a selected cost function, on the basis of experimental data. The name AL-TUNE originates from the application, i.e., altitude controller tuning. For an initial discussion, see [19]. AL-TUNE consists of zero-order branch-and-bound algorithms [34], modified and applied here to the tuning task (a hexacopter's altitude controller is tuned with AL-TUNE in a real-world experiment, and an RTK GPS system is used to evaluate the performance of the method). This new approach is very attractive, as it can be used in a range of in-flight tuning problems. This paper presents a family of novel methods for real-time, model-free, optimization-based tuning of controllers, with experimental verification during flight on a PID-type altitude controller in a hexacopter UAV. The methods can be used to tune general controllers with a small number of parameters in an orderly manner, with gain bounds and with a defined tolerance. This type of automated tuning is based on measurements only. It has a low computational cost, and it reduces the tuning time dramatically. The proposed tuning method needs no analytical model of the UAV, and takes no modeling issues into account. This makes it a simple and effective tool, unlike the methods available in the current literature. The possible absence of a model, combined with low computational complexity, rapid action and the advantages mentioned above, forms the novel contribution of the paper.
Table 1 A short characterization and grouping of the approaches in the field

Ref.     Group  Tuning technique       Controller type
[11]     G      NN                     nonlinear
[5]      Q      ACO                    pole-placement RST
[37]     Q      ILC                    nonlinear
[13]     Q      Lyapunov-based MRAC    nonlinear
[33]     Q      model-based            PID
[4]      Q      MRAC/MIT               pole-placement RST
[7]      Q      MRAC                   LQR
[38]     Q      NN                     nonlinear
[14]     Q      PSO, ACO               PID
[16]     Q      PSO                    FOPI
[17]     Q      PSO-like               PID
[26]     Q      various                nonlinear
[29,31]  Q      various                PID
[3]      T      FL, MRAC               pole-placement RST
[2]      T      vision-based model     nonlinear
[12]     F      FL                     nonlinear
[32]     F      FL                     PI

An analysis of experimental records (ROS bags) obtained during preparations for the MBZIRC 2017 Treasure Hunt and Autonomous Landing challenges, where precise altitude control was crucial for success, has shown that manual tuning of a single set of control parameters for a single UAV required a mean time of ca. 45 minutes. In addition, the configuration parameters of the AL-TUNE scheme are easily interpretable and provide the required performance of the optimization algorithm. This is a major advantage of the AL-TUNE scheme, enabling streamlined configuration in various potential applications. When prototypes of the algorithms were being developed, a combination of the Robot Operating System (ROS) environment and the Gazebo environment was used to verify the zero-order methods. These initial Gazebo simulations were crucial for testing the correctness of the methods, for implementing them, and for integrating them into a complex UAV system. However, in order to obtain real values, field experiments in real-world conditions are necessary.
The model of the UAV simulated in Gazebo reflects the dimensions, the specification and the parameters of a real UAV, together with all software solutions, the impact of the environment on the UAV, and the fusion of the readings from sensors. This allowed extensive initial simulation tests to be performed in order to identify the strong and weak points of the methods, which were finally verified by means of real experiments.
The paper is structured as follows: Section 2 presents the software tools used to perform the initial tests of the methods, together with insights into the architecture of the UAV used to conduct the experiments that finally verify the applicability of the methods. Section 3 describes the novel on-line adaptation mechanism proposed in the paper. Section 4 presents the family of optimization algorithms, Section 5 refers to simulation results on a large data set, and the experimental results are presented in Section 6. Finally, conclusions and comments are presented in Section 7.

UAV Architecture and Software Tools
The position control system must be capable of ensuring good performance for large deviations from the hover configuration, and of compensating for heavy wind. The proposed tuning scheme is used here to tune a pair of gains of a PID-type fixed-parameter altitude controller, with a small integral gain value ensuring precise reduction of the steady-state position error. The proportional and derivative gains are tuned here to ensure superior performance in dynamical conditions, as these gains play a major role in obtaining good transient behaviour.
A nonlinear controller is used with the torque M ∈ R³ and the thrust force f_T ∈ R as inputs, see [24] and [27]:

f_T = (k_P e_x + k_D e_v + m g e_3 − m ẍ_d) · R e_3,    (1)
M = −k_R e_R − k_Ω e_Ω + Ω × J Ω − J (Ω̂ Rᵀ R_c Ω_c − Rᵀ R_c Ω̇_c),    (2)

where m is the mass of the UAV, g is the gravitational acceleration, e_3 is the vertical unit vector, J ∈ R³ˣ³ is the inertia matrix with respect to the body frame, R ∈ R³ˣ³ is the orientation matrix of the center of mass, R_c ∈ R³ˣ³ is the commanded orientation matrix, Ω ∈ R³ is the angular velocity vector in the body-fixed frame, Ω_c ∈ R³ is the commanded angular velocity vector in the body-fixed frame, ẍ_d is the desired acceleration, and k_P, k_D, k_R, k_Ω are positive definite terms (see [27]).
Here e_x, e_v, e_R, and e_Ω denote, respectively, the position, velocity, orientation, and angular rate errors, with the subscript ·_c denoting the commanded value. As mentioned above, in our approach, the gains k_P and k_D are subject to tuning.
The experiments were conducted on a customized micro aerial vehicle platform, composed of a DJI F550 hexacopter frame and DJI E310 motors, a PixHawk autopilot, a Real Time Kinematic (RTK) GPS, a TeraRanger One rangefinder sensor, used for obtaining altitude information when gathering data for the cost function value during optimization, and an on-board Intel NUC-i7 PC (see Fig. 2). The system would also work with a standard GPS, as the GPS drift is slow, and the short-term position error is negligible when calculating the cost function values. In laboratory conditions, any motion capture system can be used in a similar way, e.g. VICON or OptiTrack, but the authors have focused on real-world applications. The proposed tuning scheme has negligible computational demands, as it is only necessary to collect a pair of cost function values and to pick two intermediate points by a simple inequality rule. This platform was developed for the Mohamed Bin Zayed International Robotic Challenge 2017, where its good performance was proven in the Treasure Hunt and in the Autonomous Landing task on a moving car with a UAV [10,39]. It should be stressed that such a powerful PC is not required, as stated above, for running the proposed methods. However, it was used for testing purposes, where it provided a detailed analysis of the results, as a full ROS bag with complete sensory data can be processed and stored on-board in real time using this setup.

Fig. 2 The DJI F550 test platform
The position/altitude control system uses a mature MPC-based system for transforming a given trajectory primitive into a feasible trajectory primitive, ensuring a smooth transition along the reference. This model predictive control (MPC) approach modifies the reference trajectory by minimizing the cost function related to the fourth-order derivative of the position, as the control inputs (torques) depend directly on it, see [8,9]. The MPC scheme modifies the reference primitive, which is tracked in our experiments by the altitude of the UAV. The modified reference primitive defines the desired dynamics of the UAV, and is indirectly related to the gains obtained from the AL-TUNE scheme. This is what motivated the authors to use 3 different reference primitives in the initial simulation tests. The aim was to obtain a set of controller gains for the required dynamics of the control system.
For more information about the hardware that was used in the experiments, see [35], with a detailed description of the hardware and software solutions, including path planning, low- and high-level control, etc. In addition, an extended description can be found in [35] of the implemented and tested precise multi-level estimator of the altitude of the UAV, which fuses data from the GPS and from the laser rangefinder sensor to enable smooth flight above the vehicle and landing on the vehicle.

Preliminaries
The family of optimization methods considered here allows tuning to be performed in cases where the model of the UAV is incomplete or undefined. The cost function for optimization purposes mirrors the performance quality of the control system, and is calculated iteration-by-iteration in a recursive fashion. Obviously, the global minimum of the cost function is unknown. It is therefore necessary to formulate the stopping criteria of the iterative tuning algorithm, by means of which the optimized controller parameters are obtained. The algorithms are designed in such a way that the amount of time required to obtain final controller parameters of a defined quality is deterministic, and can be evaluated prior to the tuning process. Consecutive steps of the tuning algorithm require no information apart from altitude measurements at equally-spaced sampling periods. The computational complexity of the tuning algorithm is extremely low, due to the use of branch-and-bound algorithms, also termed zero-order algorithms, to find the solution.
The zero-order algorithms are based on finding the minimum of a (unimodal) cost function f : R → R inside the feasible range [x^(0−), x^(0+)] of a parameter x, with x^(0+) > x^(0−). The argument of this function, in the one-dimensional case presented here, can be taken as a gain of the controller. If the cost function obtained by means of measurements/experiments is not unimodal, the algorithms stop at a local minimum. Now, let a unimodal function f be given on the range [x^(0−), x^(0+)], and let us suppose it is obtained on the basis of measurements and some mathematical derivations, e.g., by measuring signals and incrementing a control-based cost function over the sampling periods. To find the minimum, two interior points at which the cost function f is evaluated have to be taken inside the current range of the tuned parameter, in order to reduce this range in the next step. A symmetrical choice of the two interior points is proposed.
The algorithm for finding the minimum of a unimodal one-argument cost function f can be summarized as follows:
- derive the number N of iterations necessary to reduce the considered range of the parameter of f to the chosen tolerance ε, related to the length of the initial range; the iterative solution x̂* then satisfies |x̂* − x*| ≤ ε, where x* is the actual (unknown) minimum, and the superscripts (0−) and (0+) denote the left and right limits of the considered range in the 0-th iteration, respectively;
- for iterations k = 1, . . . , N:
  1) evaluate f at the two symmetrically-placed interior points x^(k−) and x^(k+) of the current range;
  2) reduce the range to D^(k), discarding the part of the range adjacent to the interior point with the larger cost value;
  3) increase the iteration counter k := k + 1;
- terminate the algorithm; the optimal iterative solution x̂* is taken from the final range.
When the number N of iterations required for the given tolerance ε is calculated, then for k = 1, . . . , N the interior points are placed as

x^(k−) = x^((k−1)−) + ρ_k γ_k,    (3)
x^(k+) = x^((k−1)+) − ρ_k γ_k,    (4)

where ρ_k defines the contraction ratio, and γ_k defines the reference range (this pair of equations can be thought of as a general view of the family of the proposed methods).
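The interval-reduction procedure above can be sketched in code. The following is a minimal illustration using the golden-search contraction ratio as one instance of the general scheme; the function name and the choice of the midpoint of the final range as x̂* are illustrative assumptions, not the authors' implementation.

```python
def golden_section_min(f, lo, hi, tol=0.05):
    """Minimize a unimodal f on [lo, hi] down to a relative tolerance `tol`.

    One instance of the general zero-order scheme: two symmetric interior
    points per iteration, then the sub-range adjacent to the point with
    the larger cost value is discarded.
    """
    rho = (3 - 5 ** 0.5) / 2          # golden-search contraction ratio, ~0.382
    width0 = hi - lo                  # initial range length (tolerance reference)
    a = lo + rho * (hi - lo)          # left interior point
    b = hi - rho * (hi - lo)          # right interior point
    fa, fb = f(a), f(b)
    while (hi - lo) / width0 > tol:
        if fa > fb:                   # minimum cannot lie in [lo, a]
            lo, a, fa = a, b, fb      # reuse b as the new left interior point
            b = hi - rho * (hi - lo)
            fb = f(b)
        else:                         # minimum cannot lie in [b, hi]
            hi, b, fb = b, a, fa      # reuse a as the new right interior point
            a = lo + rho * (hi - lo)
            fa = f(a)
    return (lo + hi) / 2              # iterative solution x-hat*
```

Note how the symmetric placement makes one interior point of iteration k coincide with a point already evaluated in iteration k − 1, so only one new cost evaluation is needed per iteration.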

Proposed iterative zero-order in-flight tuning algorithms
The general rule (3) and (4) for selecting a pair of new parameter values describes the family of algorithms that use different definitions of γ_k, and, as shown below, offer different performance in a noisy environment. In the paper, and during the experimental phase of the research, the following algorithms are considered (for a thorough description and for a comparison of their performance in computer tests, see [20,21]):
- the Fibonacci-search method (FIB),
- the golden-search method (GLD),
- the equal division method (EQL),
- the dichotomy method (DIC) in two configurations.
Table 2 presents a summary of the characteristic parameters taken by the zero-order methods, where 0 < δ ≪ 1 is a small number, F_k denotes the k-th Fibonacci number, starting with F_0 = 1, F_1 = 1; for FIB, GLD, and DIC it holds that γ_k = x^((k−1)+) − x^((k−1)−), whereas for EQL the range is reduced by a constant factor related to the length of the initial range, i.e. γ_k = γ_0. It will be shown below that this difference gives rise to various contractions of the ranges along the iterations, making some of the methods more suitable in a real-world environment.
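The consequence of the different γ_k definitions is how far apart the two probed gain values stay as the iterations progress, which in turn determines how distinguishable their cost values remain under measurement noise. The sketch below illustrates this qualitatively; the DIC placement rule and the constant-step EQL behaviour are assumptions based on the description in the text, not the exact Table 2 formulas.

```python
def gap_fraction(method, n_iter, delta=0.05):
    """Distance between the two interior points at each iteration,
    expressed as a fraction of the *initial* range.

    Illustrative sketch only: "GLD" contracts the current range by the
    golden ratio, "DIC" probes two points delta*width apart around the
    midpoint, and "EQL" keeps a constant offset tied to the initial range.
    """
    rho_gld = (3 - 5 ** 0.5) / 2          # golden-search contraction ratio
    gaps, width = [], 1.0                 # current width / initial width
    for _ in range(n_iter):
        if method == "GLD":
            gaps.append((1 - 2 * rho_gld) * width)   # points ~0.236*width apart
            width *= 1 - rho_gld                     # range shrinks to ~0.618*width
        elif method == "DIC":
            gaps.append(delta * width)               # points straddle the midpoint
            width *= (1 + delta) / 2                 # near-halving per iteration
        elif method == "EQL":
            gaps.append(delta)                       # constant gap, set by gamma_0
            width -= delta                           # linear shrinkage of the range
        else:
            raise ValueError(method)
    return gaps
```

Under this model the GLD and DIC gaps shrink geometrically, while the EQL gap stays constant, which is consistent with the noise-robustness ranking observed later in the simulations.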
In the FIB, EQL and GLD methods, one of the points selected at the k-th iteration is already known from the (k−1)-th iteration, due to the rule for selecting ρ_k. This can be used either to average the cost function values between iterations, e.g. to reduce the impact of external disturbances such as wind gusts on the current value of f in the k-th iteration, or to reduce the running time of the algorithm by skipping the current point, as the value of f has already been obtained.

Presentation of the Scheme
The final optimized values of the gains of a two-parameter controller can easily be found on the basis of the methods presented in the previous section. The running time of the tuning procedure is a function of the tolerance ε, expressed as the relative length (i.e., with reference to the initial range) of the final range within which the actual, though unknown, final gain lies, and of the number of bootstrap cycles N_b defined by the user. In the application considered here, a bootstrap cycle is composed of two consecutive optimization tasks solved on the basis of measurements (incremental calculation of f). One parameter is changed at a time, keeping the remaining parameter constant, either at the value initially stipulated by the user, or at the value resulting from the previous bootstrap. In this way, the first parameter is tuned first, keeping the second one constant, and then the second parameter is tuned, keeping the first one constant. Of course, the proposed AL-TUNE scheme can be extended to a greater number of parameters, which results in longer convergence of the iterative minimizer. A good solution for controllers with a greater number of parameters is to split the tuning into stages, where independently working controllers, e.g. in the angle and rotational velocity control loops, are tuned separately. Another effective approach is to fix one gain of the controller, tuning the remaining gains, and to repeat the experiments for a discrete set of values of the fixed gain defined a priori. The behaviour of the proposed algorithms can be interpreted from the viewpoint of the steepest-descent method [20], as the local minimizer obtained in the termination phase of a bootstrap is used to start the next iteration.
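The bootstrap structure can be sketched as a coordinate-descent loop over the two gains. In the sketch below, `cost(kp, kd)` stands in for one in-flight measurement of the performance index; the function names are illustrative assumptions, not the authors' API, and golden-section search is used as the inner zero-order method (any method from the family would fit).

```python
def tune_two_gains(cost, p_range, d_range, d0, n_boot=2, tol=0.05):
    """Bootstrap scheme sketch: tune k_P with k_D fixed, then tune k_D
    with the new k_P fixed, and repeat for n_boot cycles."""
    def line_min(g, lo, hi):
        # inner single-variable zero-order search (golden-section instance)
        rho = (3 - 5 ** 0.5) / 2
        w0 = hi - lo
        a, b = lo + rho * (hi - lo), hi - rho * (hi - lo)
        ga, gb = g(a), g(b)
        while (hi - lo) / w0 > tol:
            if ga > gb:                   # minimum lies to the right of a
                lo, a, ga = a, b, gb
                b = hi - rho * (hi - lo)
                gb = g(b)
            else:                         # minimum lies to the left of b
                hi, b, gb = b, a, ga
                a = lo + rho * (hi - lo)
                ga = g(a)
        return (lo + hi) / 2

    kp, kd = None, d0                     # k_D starts at the user-stipulated value
    for _ in range(n_boot):
        kp = line_min(lambda x: cost(x, kd), *p_range)   # stage 1: tune k_P
        kd = line_min(lambda x: cost(kp, x), *d_range)   # stage 2: tune k_D
    return kp, kd
```

Each stage reuses the minimizer from the previous stage as the fixed value, which is the steepest-descent-like behaviour mentioned above.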
In the two-parameter framework, the vector of parameters x = [x_1, x_2]ᵀ of the cost function f(x) related to the performance of the control system defines the gains of the controller. From the tuning viewpoint, the cost function reflects the degree to which the gains are tuned, and increases for mistuned gains. The following assumptions are made:
- the admissible ranges [x_i^(0−), x_i^(0+)], i = 1, 2, are known, and are usually related to the ranges of safe operation of the UAV (these come from initial tests carried out with the UAV, or as a span of values of the controller gains around, and including, the default controller gains);
- a zero-order method is used to find x̂*, based on bootstraps and the algorithms from Section 3;
- the cost function f is related to a measure of the performance of the control system; here it has been assumed to be the sum of the absolute tracking error samples over a horizon N_ref of a selected number of samples, related to the period of the reference primitive r_alt.
The 1-norm tracking error function has been used since the measured errors, given in meters, are relatively small. This, in addition, shows the great robustness of methods that identify the minimizer on the basis of noisy measurements, and imposes a linear penalty on increasing error amplitudes. In this way, the cost function, which is in fact a performance index, is incremented in each sampling period T_S as a linear function of the current absolute value of the tracking error. In each of the methods, the cost function has been calculated as presented above, processing the values of the low-pass filtered altitude tracking error over the defined time horizon of 50 samples, with a sampling period of 0.05 sec.
After the performance index (cost function) has been gathered, the optimization step is carried out, either changing the parameters or reducing the range of a parameter, followed by several samples (here: 10) allowing transients to decay whenever the controller parameters are changed.
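The incremental 1-norm cost calculation with the settle-sample skip can be sketched as follows. The first-order low-pass filter and its coefficient `alpha` are illustrative assumptions; the text only states that a low-pass filtered tracking error is used.

```python
def accumulate_cost(err_samples, n_settle=10, alpha=0.2):
    """1-norm performance index sketch: skip `n_settle` samples after a
    gain change so transients decay, low-pass filter the altitude
    tracking error, then sum absolute values sample by sample."""
    f, e_filt = 0.0, 0.0
    for k, e in enumerate(err_samples):
        e_filt = alpha * e + (1 - alpha) * e_filt    # first-order low-pass filter
        if k >= n_settle:                            # transient assumed decayed
            f += abs(e_filt)                         # linear penalty per sample
    return f
```

With a 50-sample horizon at T_S = 0.05 sec, one such accumulation corresponds to 2.5 sec of flight per probed gain value.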
At each step of the zero-order algorithm, two interior points in the range considered in the prior iteration are taken, related to two different controller gain values. After the values of the cost function have been gathered for the pair of points, a decision can be made on how to reduce the range. For all the methods with the exception of DIC, one of the points, starting from the second iteration, has already been evaluated. On this basis, as stated in the previous section, the algorithm can be further modified either to average the cost function values between iterations, or to skip the evaluation of the already-known point.

Discussion of Optimality
First, let f(x_1, x_2) be a unimodal function within the admissible ranges D, such that the one-argument restrictions f(x_1, b) and f(a, x_2) are, respectively, unimodal. At any stage of a bootstrap sequence, the minimized function is treated as a one-argument function, denoted henceforth as g(x), where, without loss of generality, x = x_1 or x = x_2. Since g(x) has a unique minimizer within the range [α, β] (α < β), the following holds: α ≤ x^(1) < x^(2) ≤ x* implies g(x^(1)) > g(x^(2)), and x* ≤ x^(1) < x^(2) ≤ β implies g(x^(1)) < g(x^(2)). It is assumed that g(x) is calculated/measured with finite accuracy. In this one-dimensional search, g is evaluated at N distinct points x^(1), x^(2), . . . , x^(N), where the placement of x^(k+1) is a function of g(x^(j)) for j ≤ k. The search procedure terminates when the required reduction of the initial interval is achieved. Let us assume that k function evaluations have been performed, yielding the current range [x^(k−), x^(k+)]. Now, as g(x) is unimodal, the true minimizer x* must lie within [x^(k−), x^(k+)], which also holds after N iterations.
Performing the same procedure in the second stage of any bootstrap sequence yields an improved estimate of the optimal point in the 2D search space, according to the unimodality assumption on the function. The procedure stops when the rectangular range has relative dimensions below ε.
It is now obvious that, for the proposed method applied to a unimodal function f(x_1, x_2), the bootstrap sequences result in steepest-descent-like behaviour, with the search direction in the k-th iteration orthogonal to that of the (k−1)-th iteration in each bootstrap. In the case of a multimodal f, the function g is multimodal too, and the obtained minimizer x̂^(N) may be local; the minimizer after N iterations in the following stage of the bootstrap sequence is then again local, but with an improved cost function value with respect to the previous bootstrap stage, which simply results from the decision taken when selecting a new pair of points. This is a common feature of all single-pass optimization algorithms, and it is the cost of a very low computational complexity, connected only with gathering samples to calculate the performance index and comparing its values across the iterations.
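The bracketing argument above can be checked numerically. The sketch below performs one elimination step at a time and verifies that, for a unimodal g, the true minimizer never leaves the current range; the fixed contraction ratio is an illustrative choice.

```python
def reduce_once(g, lo, hi, rho=0.382):
    """One elimination step: evaluate g at two symmetric interior points
    and discard the sub-range that cannot contain the minimizer of a
    unimodal g (illustrative sketch of the bracketing argument)."""
    a = lo + rho * (hi - lo)
    b = hi - rho * (hi - lo)
    return (a, hi) if g(a) > g(b) else (lo, b)

# Invariant check: for a unimodal g, the minimizer stays bracketed.
lo, hi, x_star = 0.0, 10.0, 2.7
for _ in range(20):
    lo, hi = reduce_once(lambda x: (x - x_star) ** 2, lo, hi)
assert lo <= x_star <= hi          # x* never leaves the reduced range
```

With multimodal g the same elimination rule may discard the sub-range containing the global minimizer, which is the local-convergence caveat discussed above.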

Simulations with an Accurate Model
In order to evaluate the proposed methods, an extensive series of simulations was performed. The sum-of-absolute-tracking-errors performance index (here: the cost function f(k_P, k_D)) was averaged over a grid of 24 × 24 combinations of the two controller parameters from the safe range. The data was gathered over a long time horizon, averaged over 20 periods of the reference signal for each of the 576 combinations. In this way, the surface of the performance index was obtained, presented in Fig. 5a-e in a logarithmic scale. The choice of the scale followed from the very flat graph of the performance index surface, which would be very problematic for non-zero-order algorithms.
Having collected the values of the performance indices, three series of tuning experiments were conducted for fast (F, v_max = 10 m/s, a_max = 100 m/s²), medium (M, v_max = 3.33 m/s, a_max = 11.11 m/s²), and slow (S, v_max = 1.67 m/s, a_max = 2.78 m/s²) reference primitives in the form of low-pass filtered square waves with a period of 12 sec, depicted in Fig. 5a-e in green, blue and red, respectively.
During all simulations, it was assumed that:
- 2 bootstraps are performed,
- k_P is sought in the range [4,12], and k_D in [2,7],
- ε = 0.05 in DIC1, GLD and FIB, or ε = 0.125 in EQL and DIC2,
- δ = 0.0625 in EQL, δ = 0.01 in FIB, δ = 0.04 in DIC1, and δ = 0.1 in DIC2,
- a total of 48 iterations is required in FIB, DIC2 and GLD (6 iterations per half of a bootstrap), while 56 iterations are required in DIC1 and EQL (7 iterations),
- the initial values of the second parameter (k_D) take the following values: 1, 3, 5, 7, 9,
- for all methods apart from the DIC family, the algorithms either used the calculated/averaged cost function or skipped the points known from previous iterations, which gave rise to 45 simulations per algorithm, and 15 for the DICs.
As can be seen, all the methods find the local minimum, but differ in their ability to reproduce the result in another simulation. This results from the relative difference between the ends of the ranges along consecutive iterations. Figure 10 shows that the EQL method preserves a large difference between the two configurations of the controller parameters through all iterations. This enables the algorithm to gather the cost function in sufficiently varied work conditions, supporting a good decision on how to bound the new range. In DIC1, however, the range decreases rapidly, which causes the controller parameters to be found almost randomly (see Fig. 5d). When the difference between the ends of a range is moderate (see DIC2 or FIB), the tuned parameters are much closer to the true minimum point, which is especially important in experiments. When the difference is small and external disturbances affect the measurements, gathering performance indices for two gains that are too close together may result in performance deterioration of the tuning method, making the gains indistinguishable from the performance viewpoint. This was the main motivation for validating the performance of the DIC method in two configurations.
If the tolerance ε is altered, the resulting change in the required number of iterations N is depicted in Fig. 11. As can be seen, N grows rapidly for the EQL and DIC2 methods, which proved to have the best performance here. This is also strongly related to Fig. 10: the methods that keep a fair distance between the selected gain pairs gather measurements that are sufficiently different from each other to make the pairs distinguishable. The price for this, visible in Fig. 11, is that these methods require a greater number of iterations.
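The trend of growing N follows directly from the contraction factor: the number of interval reductions needed to shrink a range below a tolerance grows logarithmically as the tolerance decreases. A minimal sketch (the contraction factors below are illustrative, not the exact values of the paper's methods):

```python
import math

def iters_needed(width, tol, contraction):
    """Minimum number of interval reductions bringing a search range of
    the given width below `tol`, for a per-step contraction factor
    (e.g. ~0.618 per step for golden section)."""
    return math.ceil(math.log(tol / width) / math.log(contraction))
```

For example, shrinking a gain range of width 8 below a tolerance of 0.05 takes 11 golden-section steps (factor 0.618) but only 8 ideal bisection steps (factor 0.5); halving the tolerance adds only a constant number of further steps.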
In addition, a series of 10 tuning simulations was conducted with the UAV and the EQL algorithm in two configurations: with a cylindrical load of 0.5 kg, diameter 0.3 m, height 0.1 m attached to the UAV, and without the load. In the first case, the gains averaged over 10 simulations were k_P = 10.8, k_D = 5.16, whereas in the second case they were k_P = 9, k_D = 3.72. To show the impact of proper tuning on the behaviour of a UAV, 4 long-horizon simulations were conducted with 100 repetitions of the reference primitive, as summarized in Table 3.

AL-TUNE: Experimental Results
The proposed methods were verified in experiments carried out on the same UAV (Fig. 6). The cost function was built as the sum of the absolute tracking error samples; the ranges of the parameters and all other settings were the same as in the simulations. A low-pass filtered tracking error was used with the M trajectory primitive of 12 s length, varying in the range 2-3 m. Each experiment had the following stages: arm the UAV, reach the desired position and an altitude of 3 m, tune, and land. The experiments were conducted in fully autonomous mode, with the ground truth taken from differential GPS. The results presented here come from the tuning stage. In all cases, the initial value of k_D was set to 9. The experiments were conducted in windy conditions; variable wind gusts are the most problematic environmental factor for UAV stability. The experiments, and also numerous other deployments of the system in various environmental conditions [30,40,43], have shown that the selected SO3 controller (1)-(2) with properly tuned parameters provides sufficient stability. The rates of pitch, roll and yaw, and also the altitude, are controlled by the PixHawk embedded stabilizer. The special orthogonal group SO3 controller described briefly in Eqs. 1 and 2 was implemented on the NUC-i7 on-board PC, and supplies the PixHawk (PX4), which internally uses PID controllers, with the altitude reference. The controllers of all the axes are fully decoupled; the control inputs, i.e., the rates, for the altitude rate controller are orthogonal and do not influence each other. The outer-loop controller based on SO3 ensures control in position, velocity and acceleration, and its gains are tuned here by AL-TUNE. Figure 7 presents the controller gains during tuning. As can be seen, FIB, GLD and DIC2 terminate earlier, whereas EQL and DIC1 terminate after 672 s. The differences discussed in the section dealing with simulations can also be observed here.
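The cost construction described above can be sketched as follows; the low-pass filter coefficient alpha is an assumed value for illustration, not a figure reported in the paper.

```python
def tuning_cost(altitude_ref, altitude_meas, alpha=0.9):
    """Sum of absolute low-pass filtered tracking-error samples,
    accumulated on-line over one gain-evaluation window."""
    cost, e_f = 0.0, 0.0
    for r, y in zip(altitude_ref, altitude_meas):
        e_f = alpha * e_f + (1.0 - alpha) * (r - y)  # first-order low-pass of the error
        cost += abs(e_f)
    return cost
```

A lower cost indicates better tracking for the gain pair under evaluation; filtering the error before accumulation suppresses the influence of sensor noise on the comparison between two nearby gain configurations.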
The final gains lie approximately in the same area as the minimum cost-function area from Fig. 5a-e. Figure 8a presents the tracking errors. It can be seen that, due to proper tuning, the absolute tracking error converges to zero. The large tracking error values are due to wind gusts, which did not cause the tuning to fail. Figure 8b and c present the difference in tracking quality on the basis of the DIC1 results for poorly-tuned and well-tuned parameters, in order to depict the need for good tuning of the controller parameters and for adapting the gains to the mission requirements (i.e., the required speed of the transients); the DIC1 example generalizes to the other methods of the family. To depict the behaviour of the UAV during the tuning procedure, a sample case of EQL tuning is presented in Fig. 9, showing the changes in roll, pitch, and yaw angles. Slight changes in the angles are caused by environmental disturbances, such as wind gusts, and do not substantially impede the tuning process. Table 4 gives a statistical comparison of simulation vs. experimental results, presenting the results of a single experimental run for each method, and the series of simulations from Fig. 5a-e. As can be seen, the best results are obtained by the FIB and EQL methods, whose controller gains from the experiments lie close to those from the simulations. However, the DIC and GLD methods (see Fig. 10) have contraction ratios that make the gains indistinguishable in the initial iterations, turning the tuning process into a random walk. They therefore cannot be applied in the cases discussed here, i.e. when consecutive values of the cost function differ only very slightly (Fig. 11).
As a result of the extensive tests of the AL-TUNE scheme, it has been found that the best results are obtained by the methods that preserve the widest possible ranges for the sought parameters during all iterations (see Fig. 10), and that at the same time have the smallest standard deviation among the final points obtained from different simulation runs, namely EQL, followed by DIC2. Very wide parameter ranges make the controller gains selected between the iterations distinguishable, despite disturbances, wind gusts or noisy altitude measurements, and reduce their impact on the performance of the methods.

Conclusions
This paper has presented a family of new real-time autotuning methods for controllers based on a zero-order optimization algorithm. The following advantages of the methods can be listed on the basis of the experiments reported here:
- for defined feasible ranges, the methods can be used for optimization-based tuning of controllers,
- a suitable method can be selected, depending on the level of the external forces acting on a UAV (e.g. wind gusts), by changing the controller gains during the experiment sufficiently to enable a comparison of the cost-function values and carry on the optimization; in the end, the tuning method considerably improves the tracking performance,
- the proposed methods are fast and do not require much memory; in addition, they are very versatile and can be used to tune both low-level and high-level controllers,
- the duration of a single bootstrap can be greatly reduced for fast-dynamics UAVs, or by omitting gain configurations that have already been checked in prior iterations of the tuning algorithm,
- for well-defined initial ranges, the proposed method results in a search for the optimum in the domain of all stabilizing controllers, because the predefined ranges define, in general, the interior of the closed-loop stability region; the method therefore operates safely, eliminating potential threats to operators.
(Fig. 11: Minimum number of iterations N vs. tolerance ε.)
The proposed approach enables rapid tuning and, unlike the previously mentioned tuning techniques, requires virtually no prior knowledge in the form of a UAV model. Thanks to its versatility, the family of tuning methods can be used to tune controller parameters from scratch, to adapt them to various work regimes and various loads, and even to fine-tune existing controllers, pushing them to their performance limits.

Future Work
When a controller working in a control system has already been tuned (e.g. with one of the methods presented in the introduction), and the dynamic properties of the plant change, because loads are gripped or released, because operating conditions change, or because the assumed model results in poor control quality, the proposed approach can find new, superior controller parameters, e.g. in the vicinity of the previously used gains, by defining ranges that include the prior set of admissible parameters. It can potentially be used in various tasks, especially tuning in aerial manipulation, grasping, and perching, where the change in the center of mass has a radical impact on control performance, which could be an interesting area for future research.
The introduced family of novel tuning methods is planned to be extended to the case of a pair of UAVs, which has already been done to some extent (see [18]), in order to further reduce the running time of the algorithm and to increase the resilience of the method against environmental disturbances. Secondly, the methods could be applied to increase the efficiency of UAV deployment in last-centimeter delivery systems operating in the vicinity of humans (see [23]), or to increase the precision of cooperation in swarms of UAVs (e.g. in MBZIRC 2023-related tasks, see [22]).

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.