# Fault compensation by online updating of genetic algorithm-selected neural network model for model predictive control

## Abstract

This paper presents a methodology that combines a dual-net model and model predictive control (MPC) to compensate for degraded system performance caused by slow-paced faults/anomalies. The dual-net model comprises an offline and an online artificial neural network (ANN), along with a switch that selects one of them for MPC. Through selective online updating of weight parameters, the online ANN is able to accurately capture the fault-induced variations in system dynamics, and can be used for MPC reconfiguration and fault compensation. Specifically, the system dynamics are identified by training a multilayer perceptron (MLP). To improve the model accuracy, a meta-optimization approach based on the genetic algorithm is applied to optimize the MLP hyperparameters and the training algorithm. A dual-thread decision maker is proposed to manage the robust model updating scheme and the dual-net model switch. A case study of numerical simulation using an unmanned quadrotor is undertaken to verify the feasibility of the proposed method to mitigate performance degradation. Salient performance in response prediction and control subject to a gradually growing anomaly is successfully demonstrated. Quantitatively, the proposed updating model outperforms the offline ANN model and yields 2× and 4× lower errors, respectively, for prediction and control of the system response.

## Keywords

Neural network · Meta-optimization · Model predictive control · Anomaly mitigation

## 1 Introduction

Machine learning techniques, in particular artificial neural networks (ANNs), have emerged over the past three decades as effective and popular approaches to identify complex behavior in nonlinear systems and to enable accurate and robust control [1, 2, 3]. This is because it is often difficult to construct accurate, physics-based, control-oriented models due to the complexity and unknown dynamics of the systems. The use of an ANN can overcome these issues: it makes it possible to capture the nonlinear behavior and to develop high-quality control strategies in the form of a set of predefined mathematical structures [4].

Among different classes of ANN control systems, ANN-based model predictive control (MPC) has garnered significant interest due to its applicability to nonlinear model predictive control (NMPC). ANNs serve as system models to forecast future dynamic behaviors, and these predictions can then be utilized by the controller to determine the optimal control inputs that minimize the predefined cost function. One of the key requirements of NMPC is an accurate system model with a simple mathematical structure that represents the nonlinear system behavior. Therefore, ANNs are highly desirable for identifying NMPC-compatible models. Moreover, MPC is able to easily incorporate input, output, and state constraints, which makes it a popular approach for many practical systems. ANN-based MPC has found broad use in real-world applications, including water level regulation of a tank unit, a piezoelectric actuator, and a stirred tank reactor [5, 6, 7]. These applications all utilize the most commonly used ANNs as the main architecture for system modeling, such as the multilayer perceptron (MLP) and the radial basis function network (RBFN). An MLP consists of at least three layers of neurons (or nodes), where each neuron except for the input nodes uses an activation function. An RBFN is similar to an MLP, but its neurons use radial basis functions as activation functions. Furthermore, the recurrent neural network (RNN) is also a widely used ANN structure for MPC in both system identification and control [8, 9]. For example, Han et al. [10] proposed a self-organizing RNN to control the dissolved oxygen concentration in a wastewater treatment process. An RNN has a unique structure in which the outputs from the previous step are fed into the current step, which makes it a good approximator for time series data. In addition, a fully connected cascade (FCC) network, which has direct connections from all input neurons to all output and hidden neurons, was applied by Negri et al. [11] in MPC for pressure control of a water tank.

Although various ANNs are capable of improving computational efficiency and control performance, ANN-based MPCs have difficulty compensating for disturbances [12]. Mild disturbances caused by mismodeling and the environment may be moderated by the receding horizon technique. Nevertheless, actual mechanical systems all undergo slow-paced degradation arising from wear, tear, corrosion, minor damage, and failure. This causes the behavior of the actual system to drift away from the ANN model identified under normal operation, so that the model prediction deviates from the actual responses read from the sensors. Consequently, the performance of the MPC synthesized for the nominal (original) system will be compromised, causing a nonzero steady state tracking error, also known as an offset. Offset-free tracking for MPC has been accomplished by disturbance modeling and observer design by various groups [13, 14, 15]. Another approach to dealing with such disturbances is to employ an adaptive ANN model [16, 17, 18]. The adaptive ANN model updates its weights (or even its structure) in real time with the data collected online. When the system dynamics shift, the changes in the system are reflected in the data, enabling the ANN to learn and capture the new dynamics. The foremost merit of this method is that no a priori information about the disturbance is required. Nonetheless, it also has several distinct disadvantages. First, the adaptive ANN often updates its weights at every time step (or every few steps) even when it is not necessary, which is computationally inefficient and makes it susceptible to noise when no anomaly is present. Second, to enable online training, the selected ANN structure is usually too small to represent the actual system over a wide range of inputs and outputs. Lastly, there are limited measures to effectively circumvent overfitting or other training issues when the model is updated recursively, which may be a serious issue for safety-critical systems, such as vehicles and power plants.

In order to address the aforementioned limitations, in this paper we present a robust and feasible ANN-based MPC methodology to maintain a generalized, accurate model representation and efficiently compensate for degraded performance due to dynamic behavior shifts that are mostly caused by slow-paced anomalies of actual mechanical systems, such as wear, tear, fatigue, corrosion, etc. It should be noted that rapid, abrupt system faults or failures are not the focus of the present study. The framework includes three key components: ANN meta-optimization, the dual-net model, and MPC. In our methodology, the system dynamics are described by an ANN-based plant model in the NARMAX formulation, for which the MPC will be developed. Therefore, the first step is to determine the ANN architecture for salient model performance, which can be achieved through meta-optimization methods. ANN meta-optimization using evolutionary algorithms has been reported in the literature [19, 20, 21, 22]. In this effort, the ANN topology of the plant model is optimized using the genetic algorithm (GA) to select the most appropriate values of the time window size of the input and output delays in the NARMAX formulation and the size of the hidden layer that minimize the training and validation error. The concept of the dual-net model was originally introduced by Puttige and Anavatti [23]. Specifically, both offline and online ANN models in the form of multilayer perceptrons (MLPs) are trained for the desired system. The two ANNs are then connected in parallel with a switch that selects the one that predicted the actual system response more accurately during the previous epochs. It is shown that the dual-net model is able to outperform the individual models in quantitative prediction. In this study, the dual-net model is built on the optimal MLP structure determined by the GA above. First, the offline ANN model is trained beforehand in conjunction with the meta-optimization using a highly diverse set of data collected during previous nominal operations. The offline ANN remains unchanged throughout the entire period of the current operation and serves two purposes: (i) to identify the extent of the deviation in system dynamics resulting from the slow-paced anomalies above and inform the users; and (ii) to be utilized by the MPC whenever the online ANN is not ready for use, for example when it predicts poorly due to overfitting or other training issues. Once the offline ANN becomes available, the online ANN model is initialized as a duplicate copy of the offline ANN. During the operation, the online ANN is updated whenever the system model prediction deviates notably from the actual system response due to anomalies. The MPC is then used to design optimal control trajectories given the prescribed cost function and constraints in the presence of operational anomalies. Because of system degradation, the system dynamics and the responses read from sensors will deviate from those of the nominal system, and the MPC synthesized for the offline nominal system will exhibit steady state error even if it satisfies the control criteria. To tackle this issue, a dual-thread decision maker is proposed to manage the model updating and the switch within the dual-net model, and to coordinate its prediction with the MPC to compensate for the degraded system performance.

It should be pointed out that online ANN updating to mitigate disturbances in ANN-based MPC is well established in the literature. Therefore, combining ANN and MPC is not in itself the focus of the present study. Our objective is to combine the dual-net model and MPC and coordinate them in an organized manner during online operations to establish a deployable framework that safely maintains the performance level when the system is undergoing degradation. There are several novelties in the present effort that distinguish it from the existing work, including the use of a GA-optimized ANN model in MPC, which, to the best of our knowledge, has not been adequately investigated. A dual-thread decision maker that manages the online ANN updating and coordinates the dual-net model in MPC is also proposed, which serves as the cornerstone for organizing the entire process for enhanced robustness and efficiency. Lastly, through the case study of numerical simulation, the feasibility of compensating for degraded system performance in the presence of gradual anomalies by integrating the above key components is verified. The salient improvement in ANN prediction and control performance obtained through the proposed methodology relative to the non-updating benchmark is also demonstrated and quantitatively characterized.

This paper is organized as follows. In Sect. 2, the proposed framework/methodology of online ANN-based MPC, including GA-based meta-optimization, the dual-net model, MPC, and the dual-thread decision maker, is described. In Sect. 3, a case study of controlling an unmanned quadrotor and the procedure for implementing the anomaly are explained. The results of system identification and modeling using the GA, and the prediction and control performance of the proposed methodology, are discussed in Sect. 4. Finally, Sect. 5 concludes the paper with a summary of achievements and future work.

## 2 Online updated artificial neural network and model predictive control

### 2.1 Methodology

The structure of the dual-net model is shown in Fig. 1b, which includes two ANNs, online and offline, connected in parallel to a switch in the work flow. The offline ANN trained beforehand remains unchanged throughout the operation. When the system is in a normal status, the decision maker selects the *off* branch in the switch to utilize the offline ANN for MPC configuration. On the other hand, when a system anomaly occurs and causes the model-predicted response to deviate from the actual system response, the online ANN is re-trained during operation with the accumulated data to accurately capture the latest system dynamics. One potential issue of the proposed methodology is that a network re-trained online on biased training data may be overfitted, leading to poor control performance. Therefore, the decision maker decides in situ which model is better and should be used along with MPC by assessing the prediction accuracy of both models in the presence of anomalies. If the online updated ANN outperforms the offline ANN in prediction, the decision maker selects the *on* branch in the switch to use the former for MPC.

The model predictive control (MPC) module uses the dual-net model to generate a sequence of control signals for actuators at the desirable interval that drive the system to follow the reference signal and to mitigate adverse effects arising from the anomaly. Our MPC is based on the receding horizon technique, and its cost function considers the model-predicted response relative to the reference and the temporal variations of the control signals over a specified time horizon. A numerical optimization program is then harnessed to determine the control inputs that minimize a performance criterion over the horizon. The detailed description of each component in Fig. 1 will be presented in the following sections.

### 2.2 Genetic algorithm-guided neural network modeling

In Eq. (1), *u*(*k*) and *y*(*k*) are, respectively, the input and response at the current time step, *n*_{u} and *n*_{y} are the input and output delays, respectively, and *F*(·) is a nonlinear function that quantitatively describes the NARMAX relationship and can be determined using available input-response data. Equation (1) clearly shows that the response *y*(*k*) at the current time step depends on its historical values and the current and previous inputs. In this paper, the artificial neural network (ANN), which is one of the most widely used data-driven modeling approaches, is used to approximate the nonlinear function *F*.
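
As a minimal illustration of the dependence described for Eq. (1), the following Python sketch assembles the regressor that a NARMAX-type model would receive at step *k*. The argument ordering, past outputs first and then current and past inputs, and the name `narmax_regressor` are assumptions made for illustration only, not the authors' implementation.

```python
def narmax_regressor(u, y, k, n_u, n_y):
    """Build the model input for step k from input/output histories.

    Assumed layout (hypothetical): [y(k-1), ..., y(k-n_y), u(k), ..., u(k-n_u)].
    """
    if k < max(n_u, n_y):
        raise ValueError("not enough history for the requested delays")
    past_outputs = [y[k - j] for j in range(1, n_y + 1)]   # y(k-1) ... y(k-n_y)
    inputs = [u[k - j] for j in range(0, n_u + 1)]         # u(k) ... u(k-n_u)
    return past_outputs + inputs


# Short histories of an input and a response signal
u = [0.0, 0.1, 0.2, 0.3, 0.4]
y = [1.0, 1.1, 1.2, 1.3, 1.4]
x = narmax_regressor(u, y, k=4, n_u=2, n_y=3)
print(x)
```

The length of the regressor grows with both delays, which is why the delays are treated as hyperparameters that trade accuracy against model size.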

In the MLP model, *W*^{(1)} and *W*^{(2)} are the input-to-hidden and hidden-to-output weight matrices, respectively. The hyperbolic tangent function is used as the activation function of the hidden layer herein. Given a dataset from a physical plant, constructing a high-quality ANN model is not straightforward. Due to the large number of model hyperparameters and data configurations involved in ANN training, determining the optimal model inputs and the MLP structure for a given task using the *trial and error* approach can be time-consuming and tedious. Therefore, an automated search of hyperparameters within a broad range is carried out to achieve an optimal MLP model with a model accuracy that is otherwise not attainable through manual selection. Table 1 shows the hyperparameters of the MLP model that are identified for automated tuning, and the corresponding search ranges. For both the input and output delays, the lower and upper limits are chosen to be 1 and 30, respectively, and the number of neurons in the hidden layer is selected within the range of 1–50. Larger delays and more hidden neurons usually improve the training accuracy, but at the cost of increased size and complexity of the ANN, which in turn requires more training time and resources. Note that if the optimal values found within the search space are close to the upper limits, then the range must be extended to allow more freedom in the search. In addition to the size of the MLP, twelve different training algorithms are explored, including gradient descent (GD), Levenberg–Marquardt (LM), Bayesian regularization (BR), BFGS quasi-Newton, and others. Indeed, the accuracy of the training algorithm during ANN training heavily depends on the specific data set, e.g., the noise level.
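
To make the single-hidden-layer structure concrete, the sketch below evaluates an MLP with a hyperbolic tangent hidden layer and a linear output layer. The bias vectors `b1` and `b2`, the linear output, and the function name are assumptions for illustration; only the two weight matrices and the tanh activation are stated in the text.

```python
import math


def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: tanh hidden units, linear output units (assumed)."""
    # Hidden layer: tanh(W1 x + b1)
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    # Output layer: W2 h + b2
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


# Tiny example: 2 inputs, 2 hidden neurons, 1 output
W1 = [[0.5, 0.5], [1.0, 0.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, 2.0]]
b2 = [0.5]
out = mlp_forward([1.0, -1.0], W1, b1, W2, b2)
print(out)
```

With this layout the number of trainable weights scales with the product of the regressor length and the hidden layer size, so larger delays and more hidden neurons both enlarge the model.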

Table 1 Hyperparameters of interest and their types and range for the ANN model

| Hyperparameter | Type | Parameter range |
|---|---|---|
| Input delay | Integer | [1–30] |
| Output delay | Integer | [1–30] |
| Hidden layer neurons | Integer | [1–50] |
| Training algorithm | Integer (list index) | [1–12] |
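
A GA search over the integer ranges listed above can be sketched as follows. This is a generic GA with tournament selection, uniform crossover, random-reset mutation, and elitism; these operators, the population settings, and the placeholder `toy_fitness` are all assumptions. In the actual methodology, the fitness would be the training and validation error of an MLP trained with the candidate hyperparameters.

```python
import random

# Search ranges from the hyperparameter table: (lower, upper), both inclusive.
BOUNDS = {"input_delay": (1, 30), "output_delay": (1, 30),
          "hidden_neurons": (1, 50), "train_alg": (1, 12)}
KEYS = list(BOUNDS)


def random_individual(rng):
    return [rng.randint(*BOUNDS[k]) for k in KEYS]


def mutate(ind, rng, rate=0.2):
    # Random-reset mutation: redraw a gene inside its own range.
    return [rng.randint(*BOUNDS[k]) if rng.random() < rate else g
            for k, g in zip(KEYS, ind)]


def crossover(a, b, rng):
    # Uniform crossover: each gene comes from either parent.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def run_ga(fitness, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[:2]                      # keep the two best unchanged
        children = []
        while len(children) < pop_size - len(elite):
            p1 = min(rng.sample(pop, 3), key=fitness)   # tournament of 3
            p2 = min(rng.sample(pop, 3), key=fitness)
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = elite + children
    return dict(zip(KEYS, min(pop, key=fitness)))


def toy_fitness(ind):
    # Placeholder objective with an arbitrary optimum; NOT the real model error.
    target = [5, 5, 10, 3]
    return sum((g - t) ** 2 for g, t in zip(ind, target))


best = run_ga(toy_fitness)
print(best)
```

Because each fitness evaluation in the real setting requires training a network, the population size and the number of generations dominate the cost of the search.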

### 2.3 Model predictive control

The MPC module takes the predicted response (*y*_{n}) over a specified time horizon from the dual-net model and the reference response (*y*_{r}) as inputs, and generates the control signals over another time horizon by solving a numerical optimization problem that minimizes a performance criterion, viz., the cost function of Eq. (3), over the specified horizon.

In Eq. (3), *N*_{1} is the minimum costing horizon, *N*_{2} is the maximum costing horizon, *N*_{u} is the control horizon, *y*_{r} is the reference input, *y*_{n} is the predicted ANN output, and *ρ* is a control weighting factor. According to Eq. (3), the cost function includes not only the mean squared error (MSE) between the reference response and the ANN predictions, but also the changes in the control signal *u* as a penalty term, where \(\rho > 0\) is the penalty parameter. Therefore, *ρ* decides how much change in the control input is allowed. Larger *N*_{2} and *N*_{u} will improve the control performance, but will increase the computational load during both the offline and the online stages. The goal of MPC is to compute \([u(k + 1), \ldots ,u(k + N_{u} )]\) by minimizing Eq. (3) at every control epoch. For our simulation study, *N*_{1}, *N*_{2}, *N*_{u} and *ρ* are selected empirically, and the selected values yield consistent and reliable performance in this work. Selecting these control parameters is not critical since they do not affect the steady state error caused by the disturbance due to the dynamic shifts of the system studied herein. The stability of ANN-based MPC has been proved by the Lyapunov synthesis method in the literature [26, 27]. The most widely used algorithms to solve this type of optimization problem are Newton, quasi-Newton and Levenberg–Marquardt related methods. In this paper, a bounded BFGS quasi-Newton method is adopted because of its computational efficiency and reliability.
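
The structure of such a cost function, a tracking term plus a penalty on control moves weighted by *ρ*, can be sketched as below. The exact form of Eq. (3) is not reproduced here, so the unnormalized sums, the indexing, and the name `mpc_cost` are assumptions: the sketch uses \(J = \sum_{j=N_1}^{N_2} (y_r(k+j) - y_n(k+j))^2 + \rho \sum_j \Delta u_j^2\), where \(\Delta u_j\) is the change between consecutive control inputs.

```python
def mpc_cost(y_ref, y_pred, u_seq, u_prev, rho, n1=1):
    """Tracking error plus weighted control-move penalty (assumed form).

    y_ref, y_pred : values for steps k+1 ... k+N2 (index 0 is step k+1)
    u_seq         : candidate controls over the control horizon
    u_prev        : control applied at the previous step
    """
    # Tracking term over the costing horizon N1..N2
    tracking = sum((r - p) ** 2
                   for r, p in zip(y_ref[n1 - 1:], y_pred[n1 - 1:]))
    # Control moves: first move relative to u_prev, then successive differences
    moves = [u_seq[0] - u_prev] + [b - a for a, b in zip(u_seq, u_seq[1:])]
    return tracking + rho * sum(d ** 2 for d in moves)


J = mpc_cost(y_ref=[1.0, 1.0, 1.0], y_pred=[0.5, 0.8, 1.0],
             u_seq=[0.2, 0.3], u_prev=0.0, rho=0.5)
print(J)
```

In a receding horizon loop, a bounded quasi-Newton solver would minimize this function over `u_seq` at every control epoch (one possible choice, not confirmed by the source, is `scipy.optimize.minimize` with `method="L-BFGS-B"`), after which only the first control input is applied.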

### 2.4 Dual-thread decision maker

*E*_{off} and *E*_{on} refer to the mean squared prediction errors of the offline and online ANNs (or MLPs) relative to the actual system response, respectively, for a specified time window.

The first logic thread, on the left, is for the anomaly detection and model updating. There are a variety of anomaly detection approaches, including clustering, nearest neighbors, statistical, subspace, classifier and others [28, 29], and their applications have been reported in numerous systems [30, 31, 32, 33]. Nevertheless, finding the optimal one is out of scope of this research as our primary focus is to investigate the feasibility of the methodology to maintain system performance by updating the ANN model and MPC design during the operation using online data, especially when the plant experiences a slow-paced degradation or drift in system dynamics. Therefore, we employ the out-of-limits (OOL) approach, which is the most widely used method [34]. OOL simply uses predefined threshold values, denoted as *τ* in Fig. 4, and alerts whenever the difference between the sensor and the predicted data exceeds the threshold value.
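
A minimal sketch of the out-of-limits check on a windowed mean squared prediction error is given below. The function names and the example numbers are hypothetical; the window length and the threshold *τ* are design parameters whose values are not specified here.

```python
def windowed_mse(measured, predicted, window):
    """Mean squared prediction error over the most recent `window` samples."""
    m, p = measured[-window:], predicted[-window:]
    if not m:
        raise ValueError("no samples available")
    return sum((a - b) ** 2 for a, b in zip(m, p)) / len(m)


def out_of_limits(error, tau):
    """OOL alert: true when the error exceeds the predefined threshold."""
    return error > tau


measured = [1.0, 1.0, 1.0, 1.0]        # sensor readings (illustrative)
predicted = [1.0, 1.1, 0.8, 1.3]       # model predictions (illustrative)
e = windowed_mse(measured, predicted, window=2)
alert = out_of_limits(e, tau=0.05)
print(e, alert)
```

Averaging over a window rather than testing single samples makes the alert less sensitive to measurement noise, at the price of a slower reaction to a growing anomaly.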

The first logic is divided into two stages, which compare *E*_{off} and *E*_{on}, respectively, against the threshold value *τ* in the given order. Recall that the offline ANN models the dynamics of the original, nominal system. Therefore, when the criterion *E*_{off} > *τ* is true, it indicates the presence of the anomaly. The second criterion of Logic 1, i.e., *E*_{on} > *τ*, is used to determine the necessity of updating the online ANN model. If the online ANN has already been updated and allows accurate prediction, making *E*_{on} > *τ* false, then updating the online ANN becomes unnecessary (no update will be performed). On the other hand, if there are continuously growing anomalies or training failures that cause *E*_{on} > *τ* to be true, then the online ANN will be updated again until the prediction error falls below the threshold. Several points should be noted. The online ANN is initialized with a copy of the offline ANN, that is, initially, the weights of the online ANN are the same as those of the offline ANN. This reduces the training time since the training process does not begin with randomly assigned weights, and the weights of the offline ANN are a good starting point for the online ANN updating. Moreover, when the anomaly is detected for the first time, the online ANN will also be updated immediately because *E*_{on} = *E*_{off} > *τ*.

Although updated, the online ANN is not guaranteed to be utilized by the MPC. This is because online ANN training is highly susceptible to overfitting and other related issues that may lead to inaccurate prediction. If such a model were used by the MPC, it could deteriorate system performance or even lead to system failure. The second logic thread (on the right of the figure) and the switch in the dual-net model are introduced to prevent this issue. That is, the second logic thread compares the accuracies of the offline and the updated ANNs when predicting the actual plant responses for a specified time window, and then decides the action of the switch. If the updated ANN outperforms the offline ANN, i.e., *E*_{on} < *E*_{off}, then the switch is turned *on* and the updated ANN is used in MPC to compute the control actions, and vice versa (as shown in Fig. 1b). Throughout the entire process, both logic threads operate independently at every time step to determine when to update the online ANN and which model to use for MPC reconfiguration.
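
The two logic threads described above can be condensed into a few lines. The sketch below takes *E*_{off}, *E*_{on} and *τ* as given and returns two independent decisions; the function name and the return convention are assumptions for illustration.

```python
def decision_step(e_off, e_on, tau):
    """Return (update_online, use_online) for one time step.

    Logic 1: an anomaly is present if e_off > tau; the online ANN is
             re-trained only if its own error e_on also exceeds tau.
    Logic 2: the switch selects the online ANN only if it predicts better
             than the offline ANN, independently of Logic 1.
    """
    anomaly = e_off > tau
    update_online = anomaly and e_on > tau
    use_online = e_on < e_off
    return update_online, use_online


tau = 0.05
print(decision_step(0.01, 0.01, tau))   # nominal operation
print(decision_step(0.10, 0.10, tau))   # first detection: e_on equals e_off
print(decision_step(0.10, 0.02, tau))   # online ANN updated successfully
print(decision_step(0.10, 0.20, tau))   # online update failed (e.g. overfit)
```

Keeping the switch decision separate from the update decision means that a poorly trained online ANN can be scheduled for another update while the MPC continues to rely on the offline ANN.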

### 2.5 Dealing with overfitting for online training

Updating the ANN using online data is a formidable task, mainly due to the overfitting issue. This is because the range and the diversity of the online operational data are usually extremely limited. For example, if the objective of an unmanned aerial vehicle (UAV) during the operation is to maintain its flight at a certain altitude, then the data collected online from the UAV will have a small range in altitude variation. This is critical in ANN training since the foundation of ANN is to train a generalized model using a wide range of data. Moreover, once the controller is active in the closed loop, the operators can no longer directly manipulate the control inputs applied to the physical plant. In other words, we can only provide reference values that the controller will strive to meet. Therefore, when the online data is accumulated with the controller in the loop, the data will depend on the control scheme. For instance, the control weighting factor (*ρ*) in MPC introduced in Sect. 2.3 restricts the changes in the input, and eventually restrains the diversity of data.

Accordingly, actions are required to prevent the ANN from overfitting. The effect of the data volume used for the online ANN update is investigated first. We start with data sets of small sizes, and find through trials that increasing the data volume reduces the overfitting effect, and that a larger data volume is favorable for creating more generalized ANN models. However, an inordinately large data volume could significantly increase the data accumulation time and slow down the response in mitigating the disturbance. For this particular work, we decide to use 2 h of data accumulated online to update the ANN. The data volume will vary depending on the systems, disturbances and objectives.

Other than increasing the data volume, the technique of early stopping for ANN training is also used, with a stronger condition. Early stopping is a way to terminate ANN training when the performance error of the validation set begins to grow while the performance error of the training set continues to decrease. This means that the ANN is being overfitted to the training set, and the network is losing its generality. Whenever this event occurs, the training algorithm counts the number of occurrences. Once the number exceeds the predefined value, the training stops. The stronger condition refers to reducing this predefined value and increasing the ratio of the validation set with respect to the training set.
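
The counting rule behind early stopping can be sketched as follows. The sketch assumes that a failure is an epoch whose validation error does not improve on the best value so far, that the counter resets on improvement, and that training stops once the counter reaches `max_fail`; these details and the error values are illustrative assumptions, not the settings used in this work.

```python
def early_stopping_epoch(val_errors, max_fail):
    """Return the epoch index at which training stops, or None if it never does."""
    best = float("inf")
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, fails = err, 0      # validation error improved: reset counter
        else:
            fails += 1                # validation error did not improve
            if fails >= max_fail:
                return epoch
    return None


# Illustrative validation error history over training epochs
val_errors = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.6]
print(early_stopping_epoch(val_errors, max_fail=6))
print(early_stopping_epoch(val_errors, max_fail=3))
print(early_stopping_epoch(val_errors, max_fail=2))
```

Lowering `max_fail` corresponds to the stronger condition: training halts sooner after the validation error stops improving, which limits how far the weights can drift toward the narrow online data.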

Lastly, another means we take to mitigate the overfitting issue during online training is to apply Bayesian regularization (BR) as the training algorithm. Although Levenberg–Marquardt (LM) is found to be an excellent algorithm for offline ANN modeling due to its fast convergence rate, there is a large chance of overfitting when it is used online (see Sect. 4.1 below). In other words, LM allows more rapid changes in the weights than BR within one iteration. Applying early stopping in conjunction with BR is found to be a good option for the present work.

## 3 Case study and numerical experiment

To verify the concept of online ANN updating and degraded performance compensation, an unmanned quadrotor system is chosen to represent the actual plant, since it has a well-known, physics-based mathematical model that is easily accessible. MPC for unmanned quadrotors has been demonstrated recently by several groups [35, 36, 37, 38]. Zhang et al. [27] recently proposed ANN-based MPC for formation flight of multiple unmanned quadrotors, which uses an RNN and updates the weight parameters at every time step, in contrast to our approach based on the dual-net model.

### 3.1 Plant model

(Ω_{1}, Ω_{2}, Ω_{3}, Ω_{4}) are the angular velocities of the four rotors. The full equations of motion are given in [39]. In these equations, (*x*, *y*, *z*) and (*θ*, *ϕ*, *ψ*) represent translational and rotational motions in the body-fixed coordinate system, respectively; (*I*_{xx}, *I*_{yy}, *I*_{zz}) are the moments of inertia about each body-frame axis; (*u*_{1}, *u*_{2}, *u*_{3}, *u*_{4}) are the inputs that create motions in the directions of (*z*, *θ*, *ϕ*, *ψ*), respectively; Ω_{r} is the relative speed of the rotors; *J*_{r} is the rotor's inertia; *l* and *m* are the arm length and the total mass of the quadrotor, respectively; and *g* is the gravitational acceleration. Moreover, the inputs (*u*_{1}, *u*_{2}, *u*_{3}, *u*_{4}) are computed by multiplying by the transformation matrix shown in Eq. (5) [39].

*K*_{f} and *K*_{m} are the aerodynamic force and moment constants, respectively. The actual system has 4 inputs with 3 translational and 3 rotational motions. In order to simplify the problem and verify the feasibility of our methodology, only the yaw angle (*ψ*) and the altitude (*z*) of the quadrotor are considered in this work.
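
For orientation, the sketch below shows one common way of mapping squared rotor speeds to the four inputs for a plus-configuration quadrotor. It is not the transformation matrix of Eq. (5), which is not reproduced here: the sign convention, the rotor pairing, the omission of the arm length *l*, and the use of a single *K*_{f} and *K*_{m} for all rotors are assumptions, and the numerical values are purely illustrative.

```python
def rotor_inputs(omega, k_f, k_m):
    """Map rotor speeds (rad/s) to (u1, u2, u3, u4) under an assumed convention."""
    w1, w2, w3, w4 = (w * w for w in omega)   # squared rotor speeds
    u1 = k_f * (w1 + w2 + w3 + w4)            # total thrust, drives altitude z
    u2 = k_f * (w4 - w2)                      # differential thrust, rotors 2 and 4
    u3 = k_f * (w3 - w1)                      # differential thrust, rotors 1 and 3
    u4 = k_m * (w1 - w2 + w3 - w4)            # net reaction moment, drives yaw
    return u1, u2, u3, u4


# Opposite rotors spin at equal speeds, so only thrust and yaw moment remain.
u1, u2, u3, u4 = rotor_inputs((100.0, 120.0, 100.0, 120.0), k_f=1e-5, k_m=1e-7)
print(u1, u2, u3, u4)
```

Under this convention, equal speeds on opposite rotors cancel the two differential-thrust inputs, which matches the restriction to yaw and altitude motion adopted in this work.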

### 3.2 Offline model training and reference signal

For system identification, two separate offline MLPs are trained with the model structure described in Sect. 2.2, each representing a multi-input single-output (MISO) system. In other words, two MLPs are trained to predict the yaw angle and the altitude separately. This is a more suitable approach than training a single ANN that represents a multi-input multi-output (MIMO) system, because the model accuracy can be compromised if two totally different motions are modeled with the same set of weights. Also, if the two ANNs are not separated, the model needs to be updated simultaneously for both the yaw angle and the altitude. On the other hand, if each ANN is only responsible for estimating a single state, then the ANN updates can be performed independently. Again, the online ANN model is initialized as a copy of the offline ANN model at the beginning of the simulation.

### 3.3 Anomaly

*F*_{i} is the aerodynamic force produced by rotor *i*, *M*_{i} is the aerodynamic moment produced by rotor *i*, *ρ* is the air density, *A* is the blade area, *C*_{T} and *C*_{D} are aerodynamic coefficients, and *r* is the radius of the blades. In this work, *K*_{f} and *K*_{m} are altered continuously as a prescribed function during the first few hours of the simulation to mimic the slow-paced blade degradation arising from deformation, wear, and yielding. Figure 7 shows the magnitude of the modified aerodynamic constants of all four rotors with respect to time. The aerodynamic force constants are assumed to be equal for all four rotors to avoid any pitch and roll motions. The aerodynamic moment constants are assumed to be equal for rotors in pairs, (1, 3) and (2, 4), for the same reason. These changes will introduce disturbances in system models, leading to steady state errors.
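
A slow-paced change of an aerodynamic constant can be emulated with a simple time schedule. The prescribed function used in this work is not specified in the text, so the linear ramp, the function name, and all numbers below are hypothetical and serve only to illustrate the idea of a gradual drift followed by a hold.

```python
def degraded_constant(t_hours, nominal, final_fraction, ramp_hours):
    """Scale a constant linearly from `nominal` to `nominal * final_fraction`
    over `ramp_hours`, then hold it (hypothetical schedule)."""
    progress = min(max(t_hours / ramp_hours, 0.0), 1.0)
    return nominal * (1.0 - (1.0 - final_fraction) * progress)


# Hypothetical example: a constant losing 20 % of its value over 4 hours
for t in (0.0, 2.0, 4.0, 10.0):
    print(t, degraded_constant(t, nominal=1.0, final_fraction=0.8, ramp_hours=4.0))
```

Applying the same schedule to the force constants of all four rotors, and to the moment constants in pairs, keeps the induced disturbance confined to the altitude and yaw channels.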

## 4 Results and discussion

### 4.1 Artificial neural network hyperparameter selection

The constraints Ω_{1} = Ω_{3} and Ω_{2} = Ω_{4} are imposed to allow the system to vary only in the yaw angle and the altitude while keeping the pitch and the roll axes fixed. The input data is created by a series of random step functions with periods chosen randomly between 0.1 and 2 s. The yaw angle is limited to 5 revolutions and the maximum operational range in altitude is set to be ±100 m. Gaussian noise is added to the output data, and the magnitudes of the noise are 0.01 rad and 0.1 m for the yaw angle and the altitude, respectively.
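
An excitation signal of this kind can be generated as sketched below. The step periods between 0.1 and 2 s follow the text; the amplitude range, the sampling interval, the uniform distributions, and the interpretation of the noise magnitude as a standard deviation are assumptions for illustration.

```python
import random


def random_step_signal(duration, dt, low, high, seed=0):
    """Piecewise-constant signal with random levels and random hold times."""
    rng = random.Random(seed)
    n = int(round(duration / dt))
    signal = []
    while len(signal) < n:
        hold = rng.uniform(0.1, 2.0)            # step period in seconds
        level = rng.uniform(low, high)          # assumed amplitude range
        steps = max(1, int(round(hold / dt)))
        signal.extend([level] * steps)
    return signal[:n]


def add_gaussian_noise(values, sigma, seed=1):
    """Add zero-mean Gaussian noise; sigma is taken as the noise magnitude."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


u = random_step_signal(duration=10.0, dt=0.01, low=-1.0, high=1.0)
noisy = add_gaussian_noise(u, sigma=0.01)
print(len(u), len(set(u)))
```

Random levels and hold times excite the plant over a broad range of amplitudes and frequencies, which is what gives the offline training data its diversity.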

Selected hyperparameters for ANN model by the genetic algorithm

| | Input delay | Output delay | Hidden neurons | Training algorithm |
|---|---|---|---|---|
| Run 1 | 12 | 21 | 36 | LM |
| Run 2 | 10 | 18 | 26 | LM |
| Final choice | 10 | 20 | 36 | LM |

### 4.2 Anomaly compensation by ANN-based MPC

Comparison of control and prediction errors between the offline ANN and the dual-net model

| Output | Error | Offline ANN | Dual-net |
|---|---|---|---|
| Yaw angle | Control error (°) | 11.89 | 2.27 |
| Yaw angle | Prediction error (°) | 8.01 | 1.81 |
| Altitude | Control error (m) | 1.115 | 0.426 |
| Altitude | Prediction error (m) | 0.375 | 0.159 |
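
The improvement factors implied by the tabulated errors can be obtained as the ratio of the offline ANN error to the dual-net error. The short script below performs only this arithmetic on the values in the table; the dictionary layout is an illustrative choice.

```python
# (offline ANN error, dual-net error), copied from the comparison table
errors = {
    ("yaw angle", "control"): (11.89, 2.27),
    ("yaw angle", "prediction"): (8.01, 1.81),
    ("altitude", "control"): (1.115, 0.426),
    ("altitude", "prediction"): (0.375, 0.159),
}

# Ratio > 1 means the dual-net model has the lower error
ratios = {key: offline / dual for key, (offline, dual) in errors.items()}

for (output, kind), ratio in ratios.items():
    print(f"{output:9s} {kind:10s} error reduced by a factor of {ratio:.2f}")
```

For these values the ratios lie between roughly 2.4 and 5.2, with the larger reductions on the yaw angle.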

## 5 Conclusion

A methodology is proposed to integrate the dual-net model, which consists of the offline and online ANNs, and model predictive control (MPC) to compensate for the degraded performance caused by slow-paced, continuously growing anomalies in mechanical systems. The foremost novelty lies in the combination of the dual-net model with MPC and the dual-thread decision maker to independently determine and organize the online ANN model updating and the model switch for MPC. The new elements proposed improve the online learning/updating efficiency and the ANN model robustness, and hence open up new possibilities to realize operational autonomy for mechanical systems with anomalies on computing resource-limited platforms.

ANN-based system identification/modeling with the MLP is used to construct the offline baseline model. Its prediction accuracy is further improved by using the GA to select the optimal network structure and hyperparameters, including the time window size for input and output delays and the hidden layer size, as well as the training algorithm. The optimized MLP is then copied to initialize the online ANN model, which together with the offline ANN model forms the aforementioned dual-net model and is updated online as necessary. The dual-net model is combined with the MPC for online synthesis of the control actions applied to the physical plant.

Under the dual-thread decision maker framework, new ANN updating and switch schemes for MPC are proposed. When the ANN prediction accuracy is worse than the prescribed threshold value, the system is triggered to accumulate operational data for a specified period of time, after which the online ANN is trained on the accumulated data. During this training, the structure of the online ANN remains unchanged and only the weights are updated. Both the offline and the online ANNs run in parallel throughout the simulation and are compared with the actual plant response, and the one exhibiting better prediction accuracy is selected for MPC prediction in the next horizon.

Finally, a case study of an unmanned quadrotor model is undertaken to verify the proposed methodology through numerical simulation. The dual-thread decision maker and the dual-net model demonstrate salient performance in both the accuracy of predicting the actual plant response and the quality of system control subject to a growing anomaly. In summary, in the presence of the anomaly, the updated ANN-based MPC outperforms MPC based solely on the offline ANN, as manifested quantitatively by 4× and 2× reductions in the control error and the prediction error, respectively. The results verify the feasibility of compensating for the degraded performance caused by shifts in system dynamics.
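The updating and switching logic summarized above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the class name `DualNetDecisionMaker`, the helper `rms_error`, the `retrain` callback, and the parameters `error_threshold` and `collect_steps` are all hypothetical, the update trigger is assumed to monitor the online ANN's error, and the comparison is made per time step rather than over a full prediction horizon.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


def rms_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square difference between plant output and a model prediction."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


@dataclass
class DualNetDecisionMaker:
    """Hypothetical sketch of the two decisions: when to retrain, which model to use."""

    error_threshold: float  # assumed trigger level on the online ANN prediction error
    collect_steps: int      # assumed length of the data-accumulation period, in steps
    # Assumed callback that refits only the weights of the online ANN.
    retrain: Callable[[List[Sequence[float]], List[Sequence[float]]], None]
    collecting: bool = False
    buffer_x: List[Sequence[float]] = field(default_factory=list)
    buffer_y: List[Sequence[float]] = field(default_factory=list)

    def step(self, x, y_plant, y_offline, y_online) -> str:
        """Process one time step and return the model to use for the next MPC horizon."""
        err_offline = rms_error(y_plant, y_offline)
        err_online = rms_error(y_plant, y_online)

        # Decision 1 (model updating): start accumulating data once the error
        # exceeds the threshold, then retrain when enough samples are stored.
        if not self.collecting and err_online > self.error_threshold:
            self.collecting = True
        if self.collecting:
            self.buffer_x.append(x)
            self.buffer_y.append(y_plant)
            if len(self.buffer_x) >= self.collect_steps:
                self.retrain(list(self.buffer_x), list(self.buffer_y))
                self.buffer_x.clear()
                self.buffer_y.clear()
                self.collecting = False

        # Decision 2 (model switch): pick whichever ANN tracked the plant better.
        # Ties fall back to the offline baseline model.
        return "online" if err_online < err_offline else "offline"
```

Keeping the two decisions independent means that a poorly retrained online ANN is never forced onto the controller: the switch falls back to the offline model whenever the offline model predicts the plant at least as well.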

Despite the salient results, the current method for ANN updating has several limitations. Several techniques were implemented to address the overfitting issue; however, a few error spikes remain in the predicted results. This will become more critical when the online data are even less diverse. Therefore, model switching for MPC synthesis is used as an additional safeguard to ensure desirable MPC performance and stable operation. The root of the overfitting issue is attributed to the large number of fitting parameters (about 1500) during online training. Nonetheless, most anomaly scenarios (e.g., loose fitting, wear, fatigue, and others) occur gradually and cause incremental variations in system dynamics. Therefore, updating all of the network weights each time may be unnecessary in terms of both resource usage and model quality. Future research will focus on further investigating and mitigating these issues.

## Notes

### Acknowledgements

This research was sponsored by the DoD/Army Research Laboratory (ARL) under the Contract Number W911QX-18-P-0180. The authors would like to acknowledge Mr. Eric Mark at the ARL for his support and feedback on the present work.

### Compliance with ethical standards

#### Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

## References

- 1. Narendra KS, Parthasarathy K (1990) Identification and control of dynamical systems using neural networks. IEEE Trans Neural Netw 1(1):4–27
- 2. Hagan MT, Demuth HB, Jesús OD (2002) An introduction to the use of neural networks in control systems. Int J Robust and Nonlinear Control IFAC-Affil J 12(11):959–985
- 3. Mohammadzaheri M, Chen L, Grainger S (2012) A critical review of the most popular types of neuro control. Asian J Control 14(1):1–11
- 4. Draeger A, Engell S, Ranke H (1995) Model predictive control using neural networks. IEEE Control Syst 15(5):61–66
- 5. Patan K, Korbicz J (2012) Nonlinear model predictive control of a boiler unit: a fault tolerant control study. Int J Appl Math Comput Sci 22(1):225–237
- 6. Cheng L, Liu W, Hou ZG, Yu J, Tan M (2015) Neural-network-based nonlinear model predictive control for piezoelectric actuators. IEEE Trans Ind Electron 62(12):7717–7727
- 7. Wang T, Gao H, Qiu J (2016) A combined adaptive neural network and nonlinear model predictive control for multirate networked industrial process control. IEEE Trans Neural Netw Learn Syst 27(2):416–425
- 8. Akpan VA, Hassapis GD (2011) Nonlinear model identification and adaptive model predictive control using neural networks. ISA Trans 50(2):177–194
- 9. Pan Y, Wang J (2012) Model predictive control of unknown nonlinear dynamical systems based on recurrent neural networks. IEEE Trans Ind Electron 59(8):3089–3101
- 10. Han HG, Zhang L, Hou Y, Qiao JF (2016) Nonlinear model predictive control based on a self-organizing recurrent neural network. IEEE Trans Neural Netw Learn Syst 27(2):402–415
- 11. Negri GH, Cavalca MS, de Oliveira J, Araújo CJ, Celiberto LA (2017) Evaluation of nonlinear model-based predictive control approaches using derivative-free optimization and FCC neural networks. J Control Autom Electr Syst 28(5):623–634
- 12. Yan Z, Wang J (2014) Robust model predictive control of nonlinear systems with unmodeled dynamics and bounded uncertainties based on neural networks. IEEE Trans Neural Netw Learn Syst 25(3):457–469
- 13. Morari M, Maeder U (2012) Nonlinear offset-free model predictive control. Automatica 48(9):2059–2067
- 14. Tatjewski P (2014) Disturbance modeling and state estimation for offset-free predictive control with state-space process models. Int J Appl Math Comput Sci 24(2):313–323
- 15. Sena HJ, Ramos VS, Silva FV, Fileti AMF (2017) Adaptive offset remover based on Kalman filter integrated to a model predictive controller. Chem Eng 57:1093–1098
- 16. Alexandridis A, Sarimveis H (2005) Nonlinear adaptive model predictive control based on self-correcting neural network models. AIChE J 51(9):2495–2506
- 17. Kusiak A, Xu G (2012) Modeling and optimization of HVAC systems using a dynamic neural network. Energy 42(1):241–250
- 18. Vatankhah B, Farrokhi M (2018) Nonlinear adaptive model predictive control of constrained systems with offset-free tracking behavior. Asian J Control 21(5):2232–2244
- 19. Lam HK, Ling SH, Leung FH, Tam PKS (2001) Tuning of the structure and parameters of neural network using an improved genetic algorithm. In: The 27th annual conference of the IEEE Industrial Electronics Society, 2001. IECON’01, vol 1, pp 25–30. IEEE
- 20. Curteanu S, Cartwright H (2011) Neural networks applied in chemistry. I. Determination of the optimal topology of multilayer perceptron neural networks. J Chemom 25(10):527–549
- 21. Zhang R, Tao J, Gao F (2016) A new approach of Takagi-Sugeno fuzzy modeling using an improved genetic algorithm optimization for oxygen content in a coke furnace. Ind Eng Chem Res 55(22):6465–6474
- 22. Zhang R, Tao J (2016) Data-driven modeling using improved multi-objective optimization based neural network for coke furnace system. IEEE Trans Ind Electron 64(4):3147–3155
- 23. Puttige VR, Anavatti SG (2008) Real-time system identification of unmanned aerial vehicles: a multi-network approach. JCP 3(7):31–38
- 24. Tian R, Yang Y, van der Helm FC, Dewald J (2018) A novel approach for modeling neural responses to joint perturbations using the NARMAX method and a hierarchical neural network. Front Comput Neurosci 12:96
- 25. Whitley D (1994) A genetic algorithm tutorial. Stat Comput 4(2):65–85
- 26. Patan K (2015) Neural network-based model predictive control: fault tolerance and stability. IEEE Trans Control Syst Technol 23(3):1147–1155
- 27. Zhang B, Sun X, Liu S, Deng X (2019) Recurrent neural network-based model predictive control for multiple unmanned quadrotor formation flight. Int J Aerosp Eng. https://doi.org/10.1155/2019/7272387
- 28. Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv (CSUR) 41(3):15
- 29. Goldstein M, Uchida S (2016) A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS ONE 11(4):e0152173
- 30. Iverson D (2008) Data mining applications for space mission operations system health monitoring. In: SpaceOps 2008 Conference, p 3212
- 31. Martínez-Heras JA, Donati A (2014) Enhanced telemetry monitoring with novelty detection. AI Mag 35(4):37–46
- 32. Lavin A, Ahmad S (2015) Evaluating real-time anomaly detection algorithms–the numenta anomaly benchmark. In: 2015 IEEE 14th international conference on machine learning and applications (ICMLA), pp 38–44. IEEE
- 33. Jeong H, Park B, Park S, Min H, Lee S (2018) Fault detection and identification method using observer-based residuals. Reliab Eng Syst Saf 184:27–40
- 34. Hundman K, Constantinou V, Laporte C, Colwell I, Soderstrom T (2018) Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding. arXiv:1802.04431
- 35. Ma D, Xia Y, Li T, Chang K (2016) Active disturbance rejection and predictive control strategy for a quadrotor helicopter. IET Control Theory Appl 10(17):2213–2222
- 36. Jiajin L, Rui L, Yingjing S, Jianxiao Z (2017) Design of attitude controller using explicit model predictive control for an unmanned quadrotor helicopter. In: 2017 Chinese Automation Congress (CAC), pp 2853–2857. IEEE
- 37. Kuyumcu A, Bayezit I (2017) Augmented model predictive control of unmanned quadrotor vehicle. In: 2017 11th Asian control conference (ASCC), pp 1626–1631. IEEE
- 38. Cheng H, Yang Y (2017) Model predictive control and PID for path following of an unmanned quadrotor helicopter. In: 2017 12th IEEE conference on industrial electronics and applications (ICIEA), pp 768–773. IEEE
- 39. ElKholy HM (2014) Dynamic modeling and control of a quadrotor using linear and nonlinear approaches. American University in Cairo