Fault compensation by online updating of genetic algorithm-selected neural network model for model predictive control
This paper presents a methodology that combines a dual-net model and model predictive control (MPC) to compensate for degraded system performance caused by slow-paced faults/anomalies. The dual-net model comprises an offline and an online artificial neural network (ANN) along with a switch that selects one of them for MPC. Through selective online updating of its weight parameters, the online ANN is able to accurately capture the fault-induced variations in system dynamics, and can be used for MPC reconfiguration and fault compensation. Specifically, the system dynamics are identified by training a multilayer perceptron (MLP). To improve the model accuracy, a meta-optimization approach based on the genetic algorithm is applied to optimize the MLP hyperparameters and the training algorithm. A dual-thread decision maker is proposed to manage the robust model updating scheme and the dual-net model switch. A case study of numerical simulation using an unmanned quadrotor is undertaken to verify the feasibility of the proposed method to mitigate performance degradation. Good performance in response prediction and control subject to a gradually growing anomaly is successfully demonstrated. Quantitatively, the proposed updating model outperforms the offline ANN model and yields 2× and 4× lower errors, respectively, for prediction and control of the system response.
Keywords: Neural network, Meta-optimization, Model predictive control, Anomaly mitigation
Machine learning techniques, in particular artificial neural networks (ANNs), have emerged over the past three decades as effective and popular approaches to identify complex behavior in nonlinear systems and enable accurate and robust control [1, 2, 3]. This is because it is often difficult to construct accurate, physics-based, control-oriented models due to the complexity and unknown dynamics of the systems. The use of an ANN can overcome these issues: it makes it possible to capture the nonlinear behavior and develop high-quality control strategies in the form of a set of predefined mathematical structures.
Among different classes of ANN control systems, ANN-based model predictive control (MPC) has garnered significant interest due to its applicability to nonlinear model predictive control (NMPC). ANNs serve as system models to forecast future dynamic behavior, and the controller then uses these predictions to determine the optimal control inputs that minimize a predefined cost function. One of the key requirements of NMPC is an accurate system model with a simple mathematical structure that represents the nonlinear system behavior. Therefore, ANNs are highly desirable for identifying NMPC-compatible models. Moreover, MPC can easily incorporate input, output, and state constraints, which makes it a popular approach for many practical systems. ANN-based MPC has found broad use in real-world applications, including water level regulation of a tank unit, a piezoelectric actuator, and a stirred tank reactor [5, 6, 7]. These applications all use the most common ANN architectures for system modeling, such as the multilayer perceptron (MLP) and the radial basis function network (RBFN). An MLP consists of at least three layers of neurons (or nodes), where each neuron except the input nodes uses an activation function. An RBFN is similar to an MLP, but its neurons use radial basis functions as activation functions. Furthermore, the recurrent neural network (RNN) is also a widely used ANN structure for MPC in both system identification and control [8, 9]. For example, Han et al. proposed a self-organizing RNN to control the dissolved oxygen concentration in a wastewater treatment process. An RNN has a distinctive structure in which the outputs from the previous step are fed into the current step, which makes it a good approximator for time series data. In addition, a fully connected cascade (FCC) network, which has direct connections from all input neurons to all output and hidden neurons, was applied by Negri et al. in MPC for pressure control of a water tank.
Although various ANNs are capable of improving computational efficiency and control performance, ANN-based MPCs have difficulty compensating for disturbances. Mild disturbances caused by mismodeling and the environment may be moderated by the receding horizon technique. Nevertheless, actual mechanical systems all undergo slow-paced degradation arising from wear, tear, corrosion, minor damage, and failure. This causes the behavior of the actual system to drift away from the ANN model identified during normal operation, so that the model prediction deviates from the actual responses read from the sensors. Consequently, the performance of the MPC synthesized for the nominal (original) system is compromised, causing a nonzero steady-state tracking error, also known as an offset. Offset-free tracking for MPC has been accomplished through disturbance modeling and observer design by various groups [13, 14, 15]. Another approach to dealing with such disturbances is to employ an adaptive ANN model [16, 17, 18]. The adaptive ANN model updates its weights (or even its structure) in real time with data collected online. When the system dynamics shift, the changes in the system are reflected in the data, enabling the ANN to learn and capture the new dynamics. The foremost merit of this method is that no a priori information about the disturbance is required. Nonetheless, it also has several distinct disadvantages. First, the adaptive ANN often updates its weights at every time step (or every few steps) even when this is not necessary, which is computationally inefficient and makes the model susceptible to noise when no anomaly is present. Second, to enable online training, the selected ANN structure is usually too small to represent the actual system over a wide range of inputs and outputs. Lastly, there are limited measures to effectively circumvent overfitting or other training issues when the model is updated recursively, which may be a serious issue for safety-critical systems, such as vehicles and power plants.
To address the aforementioned limitations, in this paper we present a robust and feasible ANN-based MPC methodology that maintains a generalized, accurate model representation and efficiently compensates for degraded performance due to shifts in dynamic behavior that are mostly caused by slow-paced anomalies of actual mechanical systems, such as wear, tear, fatigue, and corrosion. It should be noted that rapid, abrupt system faults or failures are not the focus of the present study. The framework includes three key components: ANN meta-optimization, the dual-net model, and MPC. In our methodology, the system dynamics are described by an ANN-based plant model in the NARMAX formulation, for which the MPC is developed. Therefore, the first step is to determine the ANN architecture that gives good model performance, which can be achieved through meta-optimization methods. ANN meta-optimization using evolutionary algorithms can be found in the literature [19, 20, 21, 22]. In this effort, the ANN topology of the plant model is optimized using the genetic algorithm (GA), which selects the time window sizes of the input and output delays in the NARMAX formulation and the size of the hidden layer that minimize the training and validation error. The concept of the dual-net model was originally introduced by Puttige and Anavatti. Specifically, both offline and online ANN models in the form of multilayer perceptrons (MLPs) are trained for the desired system. The two ANNs are then connected in parallel with a switch that selects the one that predicted the actual system response more accurately during the previous epochs. It has been shown that the dual-net model is able to outperform the individual models in quantitative prediction. In this study, the dual-net model is built on the optimal MLP structure determined by the GA above. First, the offline ANN model is trained beforehand, in conjunction with the meta-optimization, using a highly diverse data set collected during previous nominal operations. The offline ANN remains unchanged throughout the entire period of the current operation and serves two purposes: (i) to identify the extent of the deviation in system dynamics resulting from the slow-paced anomalies above and inform the users; and (ii) to be used by the MPC whenever the online ANN is not ready for use, for example when it predicts poorly due to overfitting or other training issues. Once the offline ANN becomes available, the online ANN model is initialized as a duplicate copy of the offline ANN. During operation, the online ANN is updated whenever the system model prediction deviates notably from the actual system response due to anomalies. The MPC is then used to design optimal control trajectories given the prescribed cost function and constraints in the presence of operational anomalies. Because of system degradation, the dynamics and the responses read from the sensors will deviate from those of the nominal system, and the MPC synthesized for the offline nominal system will exhibit steady-state error even if it satisfies the control criteria. To tackle this issue, a dual-thread decision maker is proposed to manage the model updating and the switch within the dual-net model, and to coordinate its prediction with the MPC to compensate for the degraded system performance.
It should be pointed out that online ANN updating to mitigate disturbances in ANN-based MPC is established in the literature. Therefore, combining ANN and MPC is not the focus of the present study. Our objective is to combine the dual-net model and MPC and coordinate them in an organized manner during online operations, in order to establish a deployable framework that safely maintains the performance level while the system is undergoing degradation. Several novelties distinguish the present effort from existing work. The first is the use of a GA-optimized ANN model in MPC, which, to the best of our knowledge, has not been adequately investigated. The second is the dual-thread decision maker that manages the online ANN updating and coordinates the dual-net model in MPC, which serves as the cornerstone for organizing the entire process for enhanced robustness and efficiency. Lastly, through a case study of numerical simulation, the feasibility of compensating for degraded system performance in the presence of gradual anomalies by integrating the above key components is verified. The improvement in ANN prediction and control performance obtained through the proposed methodology relative to the non-updating benchmark is also demonstrated and quantitatively characterized.
This paper is organized as follows. In Sect. 2, the proposed framework/methodology of online ANN-based MPC, including GA-based meta-optimization, the dual-net model, MPC, and the dual-thread decision maker, is described. In Sect. 3, a case study of controlling an unmanned quadrotor and the procedure for implementing the anomaly are explained. The results of system identification and modeling using the GA, and the prediction and control performance of the proposed methodology, are discussed in Sect. 4. Finally, Sect. 5 concludes the paper with a summary of achievements and future work.
2 Online updated artificial neural network and model predictive control
The structure of the dual-net model is shown in Fig. 1b. It includes two ANNs, online and offline, connected in parallel to a switch in the workflow. The offline ANN is trained beforehand and remains unchanged throughout the operation. When the system is in a normal state, the decision maker selects the off branch of the switch to use the offline ANN for MPC configuration. On the other hand, when a system anomaly occurs and causes the model-predicted response to deviate from the actual system response, the online ANN is re-trained during operation with the accumulated data to accurately capture the latest system dynamics. One potential issue of the proposed methodology is that the network re-trained online with biased training data may be overfitted, leading to poor control performance. Therefore, the decision maker decides in situ which model is better and should be used with the MPC by assessing the prediction accuracy of both models in the presence of anomalies. If the online updated ANN outperforms the offline ANN in prediction, the decision maker selects the on branch of the switch to use the former for MPC.
The model predictive control (MPC) module uses the dual-net model to generate, at the desired interval, a sequence of control signals for the actuators that drive the system to follow the reference signal and mitigate the adverse effects arising from the anomaly. Our MPC is based on the receding horizon technique, and its cost function considers the model-predicted response relative to the reference and the temporal variations of the control signals over a specified time horizon. A numerical optimization program is then used to determine the control inputs that minimize this cost function over the horizon. The detailed description of each component in Fig. 1 is presented in the following sections.
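A cost function of the kind described above, penalizing tracking error and control variation over a horizon, is commonly written in the following form. This is only a sketch under stated assumptions: the horizon limits N_1, N_2, N_u, the time index k, and the symbols r, \hat{y}, and u are notation introduced here for illustration, and the exact expression used in this work may differ. The weight \rho corresponds to the control weighting factor referred to in Sect. 2.5.

```latex
J = \sum_{j=N_1}^{N_2} \bigl[ r(k+j) - \hat{y}(k+j) \bigr]^2
    + \rho \sum_{j=1}^{N_u} \bigl[ u(k+j-1) - u(k+j-2) \bigr]^2
```

Here r is the reference signal, \hat{y} is the response predicted by the selected ANN, and u is the control input. A larger \rho penalizes changes in the control signal more heavily, which gives smoother inputs at the cost of slower tracking.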
2.2 Genetic algorithm-guided neural network modeling
Table: Hyperparameters of interest and their types and range for the ANN model (recoverable entry: hidden layer neurons, integer (list index))
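The GA-guided selection of the delay window sizes and the hidden layer size can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the search ranges in SPACE, the GA settings, and in particular toy_validation_error, which is a placeholder standing in for training an MLP and returning its validation error. It is not the implementation used in this work.

```python
import random

# Hypothetical search space; each gene is an integer index into one of these lists.
SPACE = {
    "input_delay": list(range(1, 11)),
    "output_delay": list(range(1, 11)),
    "hidden_neurons": [5, 10, 15, 20, 25, 30],
}
KEYS = list(SPACE)


def random_individual(rng):
    """Draw one chromosome: a list index per hyperparameter."""
    return [rng.randrange(len(SPACE[k])) for k in KEYS]


def decode(individual):
    """Map list indices back to hyperparameter values."""
    return {k: SPACE[k][g] for k, g in zip(KEYS, individual)}


def toy_validation_error(params):
    """Placeholder fitness; a real run would train an MLP and validate it."""
    return ((params["input_delay"] - 3) ** 2
            + (params["output_delay"] - 4) ** 2
            + ((params["hidden_neurons"] - 20) / 5) ** 2)


def genetic_search(fitness, rng, pop_size=20, generations=30, mutation_rate=0.2):
    """Elitist GA: keep the better half, refill by crossover and mutation."""
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(decode(ind)))
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally redraw a gene from its full range.
            for i, k in enumerate(KEYS):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(len(SPACE[k]))
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda ind: fitness(decode(ind)))
    return decode(best)


best = genetic_search(toy_validation_error, random.Random(0))
print(best)
```

Encoding each hyperparameter as a list index keeps every gene a plain integer, so the same crossover and mutation operators apply to all of them regardless of the values in the underlying list.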
2.3 Model predictive control
2.4 Dual-thread decision maker
The first logic thread, on the left, handles anomaly detection and model updating. There are a variety of anomaly detection approaches, including clustering, nearest-neighbor, statistical, subspace, and classifier-based methods, among others [28, 29], and their applications have been reported in numerous systems [30, 31, 32, 33]. Nevertheless, finding the optimal one is outside the scope of this research, as our primary focus is to investigate the feasibility of maintaining system performance by updating the ANN model and the MPC design during operation using online data, especially when the plant experiences a slow-paced degradation or drift in system dynamics. Therefore, we employ the out-of-limits (OOL) approach, which is the most widely used method. OOL simply uses predefined threshold values, denoted as τ in Fig. 4, and raises an alert whenever the difference between the sensor data and the predicted data exceeds the threshold value.
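The OOL check can be sketched in a few lines. The root-mean-square error over a window is an assumption made here for illustration, since the text above only requires some measure of the difference between sensor and predicted data; the function names and the sample values are likewise hypothetical.

```python
from math import sqrt


def window_error(measured, predicted):
    """Root-mean-square difference between sensor data and model predictions."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return sqrt(squared / len(measured))


def out_of_limits(measured, predicted, tau):
    """Return True when the windowed error exceeds the threshold tau."""
    return window_error(measured, predicted) > tau


# A slowly drifting measurement compared with a constant nominal prediction.
measured = [1.00, 1.02, 1.05, 1.10]
nominal = [1.00, 1.00, 1.00, 1.00]
print(out_of_limits(measured, nominal, tau=0.05))
```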
The first logic thread is divided into two stages, which compare Eoff and then Eon against the threshold value τ, in that order. Recall that the offline ANN models the dynamics of the original, nominal system. Therefore, when the criterion Eoff > τ is true, it indicates the presence of an anomaly. The second criterion of Logic 1, i.e., Eon > τ, is used to determine the necessity of updating the online ANN model. If the online ANN has already been updated and predicts accurately, making Eon > τ false, then updating the online ANN is unnecessary and no update is performed. On the other hand, if continuously growing anomalies or training failures cause Eon > τ to be true, then the online ANN is updated again until the prediction error falls below the threshold. Several points should be noted. The online ANN is initialized as a copy of the offline ANN; that is, the weights of the online ANN are initially the same as those of the offline ANN. This reduces the training time, since the training process does not begin with randomly assigned weights and the weights of the offline ANN are a good starting point for the online ANN updating. Moreover, when the anomaly is detected for the first time, the online ANN is also updated immediately because Eon = Eoff > τ.
Even after it has been updated, the online ANN is not guaranteed to be used by the MPC. This is because online ANN training is highly susceptible to overfitting and related issues that may lead to inaccurate prediction. If such a model were used by the MPC, it could deteriorate system performance or even lead to system failure. The second logic thread (on the right of the figure) and the switch in the dual-net model are introduced to prevent this issue. That is, the second logic thread compares the accuracies of the offline and the updated ANNs in predicting the actual plant responses over a specified time window, and then decides the action of the switch. If the updated ANN outperforms the offline ANN, i.e., Eon < Eoff, then the switch is turned on and the updated ANN is used in MPC to compute the control actions; otherwise the switch is turned off and the offline ANN is used (as shown in Fig. 1b). Throughout the entire process, both logic threads operate independently at every time step to determine when to update the online ANN and which model to use for MPC reconfiguration.
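The two logic threads can be summarized as a small decision function. This is a minimal sketch of the rules as stated in the text, assuming that the errors Eoff and Eon have already been computed over the specified time window; the names Decision and decide are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    anomaly_detected: bool  # Logic 1, stage 1: E_off > tau
    update_online: bool     # Logic 1, stage 2: anomaly present and E_on > tau
    use_online: bool        # Logic 2: switch on when E_on < E_off


def decide(e_off: float, e_on: float, tau: float) -> Decision:
    """Evaluate both logic threads for one time step."""
    anomaly = e_off > tau
    update = anomaly and e_on > tau
    # Logic 2 is evaluated independently of Logic 1.
    use_online = e_on < e_off
    return Decision(anomaly, update, use_online)


# Nominal operation: no anomaly, no update, offline ANN used by the MPC.
print(decide(e_off=0.01, e_on=0.01, tau=0.05))
# First detection: E_on equals E_off, so the online ANN is updated at once.
print(decide(e_off=0.10, e_on=0.10, tau=0.05))
# After a successful update: no further update, online ANN used by the MPC.
print(decide(e_off=0.10, e_on=0.02, tau=0.05))
# After a failed update: retrain again and fall back to the offline ANN.
print(decide(e_off=0.10, e_on=0.30, tau=0.05))
```

Keeping the switch decision separate from the update decision means that a poorly trained online ANN is never passed to the MPC, even while retraining continues.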
2.5 Dealing with overfitting for online training
Updating the ANN using online data is a formidable task, mainly due to overfitting. This is because the range and the diversity of the online operational data are usually extremely limited. For example, if the objective of an unmanned aerial vehicle (UAV) during operation is to maintain its flight at a certain altitude, then the data collected online from the UAV will cover only a small range of altitude variation. This is critical in ANN training, since a generalized model can only be trained from data that cover a wide range. Moreover, once the controller is active in the closed loop, the operators can no longer directly manipulate the control inputs applied to the physical plant. In other words, we can only provide reference values that the controller will strive to meet. Therefore, when the online data are accumulated with the controller in the loop, the data depend on the control scheme. For instance, the control weighting factor (ρ) in MPC introduced in Sect. 2.3 restricts the changes in the input, and thereby restrains the diversity of the data.
Accordingly, actions are required to prevent the ANN from overfitting. The effect of the data volume used for the online ANN update is investigated first. We start with data sets of small sizes and find through trials that increasing the data volume reduces the overfitting effect, so a larger data volume is favorable for creating more generalized ANN models. However, an inordinately large data volume could significantly increase the data accumulation time and slow down the response to the disturbance. For this particular work, we use 2 h of data accumulated online to update the ANN. The appropriate data volume will vary depending on the system, the disturbances, and the objectives.
In addition to increasing the data volume, early stopping is used for ANN training with a stricter condition. Early stopping terminates ANN training when the error on the validation set begins to grow while the error on the training set continues to decrease. This indicates that the ANN is being overfitted to the training set and the network is losing its generality. Whenever this event occurs, the training algorithm counts the occurrence. Once the count exceeds a predefined value, the training stops. The stricter condition refers to reducing this predefined value and increasing the ratio of the validation set to the training set.
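The stopping rule can be sketched as follows. Two details are assumptions made for illustration: failures are counted consecutively and reset whenever the validation error improves, and training stops as soon as the count reaches max_fail. The function name and the sample error sequence are hypothetical. In an actual training loop the weights from the best epoch would also be saved and restored.

```python
def early_stopping(val_errors, max_fail):
    """Return (best_epoch, best_error) for a stream of validation errors.

    val_errors yields one validation error per training epoch, and max_fail
    is the number of consecutive non-improving epochs that ends training.
    """
    best_err = float("inf")
    best_epoch = 0
    fails = 0
    for epoch, err in enumerate(val_errors, start=1):
        if err < best_err:
            # Validation error improved: remember this epoch, reset the count.
            best_err, best_epoch, fails = err, epoch, 0
        else:
            # Validation error did not improve: count one more failure.
            fails += 1
            if fails >= max_fail:
                break
    return best_epoch, best_err


errors = [0.5, 0.4, 0.3, 0.35, 0.36, 0.37, 0.2]
print(early_stopping(errors, max_fail=3))  # stricter condition, stops early
print(early_stopping(errors, max_fail=6))  # looser condition, trains longer
```

With the smaller max_fail the run ends before the late improvement at the seventh epoch is seen, which shows the trade-off of the stricter condition: less risk of overfitting, but training may stop short of a later minimum.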
Lastly, another means of mitigating overfitting during online training is to apply Bayesian regularization (BR) as the training algorithm. Although Levenberg–Marquardt (LM) is found to be an excellent algorithm for offline ANN modeling due to its fast convergence rate, it carries a high risk of overfitting in online use (see Sect. 4.1 below). In other words, LM allows more rapid changes in the weights than BR within one iteration. Applying early stopping in conjunction with BR is found to be a good option for the present work.
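For context, Bayesian regularization is usually formulated as the minimization of a combined objective that penalizes large weights in addition to the data error. The expression below is the standard textbook form and is given only as a sketch; the symbols are notation introduced here, and the specific settings used in this work are not stated in the text.

```latex
F = \beta E_D + \alpha E_W, \qquad
E_D = \sum_{i=1}^{n} (t_i - a_i)^2, \qquad
E_W = \sum_{j=1}^{m} w_j^2
```

Here t_i and a_i are the target and the network output for sample i, w_j are the network weights, and \alpha and \beta are regularization parameters estimated during training. The weight penalty E_W discourages large weights, which helps explain why BR is less prone to overfitting than LM on limited online data.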
3 Case study and numerical experiment
To verify the concept of online ANN updating and degraded performance compensation, an unmanned quadrotor system is chosen to represent the actual plant, since it has a well-known, physics-based mathematical model that is easily accessible. MPC for unmanned quadrotors has been demonstrated recently by several groups [35, 36, 37, 38]. Zhang et al. recently proposed ANN-based MPC for formation flight of multiple unmanned quadrotors. Their method uses an RNN whose weight parameters are updated at every time step, which differs from our approach based on the dual-net model.
3.1 Plant model
3.2 Offline model training and reference signal
For system identification, two separate offline MLPs are trained with the model structure described in Sect. 2.2, each representing a multi-input single-output (MISO) system. In other words, two MLPs are trained to predict the yaw angle and the altitude separately. This is more suitable than training a single ANN that represents a multi-input multi-output (MIMO) system, because the model accuracy can be compromised if two entirely different motions are modeled by the same set of weights. In addition, if the two ANNs are not separated, the model must be updated for both the yaw angle and the altitude at the same time. On the other hand, if each ANN is responsible for estimating only a single state, then the ANN updates can be performed independently. Again, the online ANN model is initialized as a copy of the offline ANN model at the beginning of the simulation.
4 Results and discussion
4.1 Artificial neural network hyperparameter selection
Selected hyperparameters for ANN model by the genetic algorithm
4.2 Anomaly compensation by ANN-based MPC
Table: Comparison of control and prediction errors between the offline ANN and the dual-net model (columns: control error (°), prediction error (°), control error (m), prediction error (m))
A methodology is proposed to integrate the dual-net model, which consists of the offline and online ANNs, with model predictive control (MPC) to compensate for the degraded performance caused by slow-paced, continuously growing anomalies in mechanical systems. The foremost novelty lies in the combination of the dual-net model with MPC and the dual-thread decision maker, which independently determines and organizes the online ANN model updating and the model switch for MPC. The new elements proposed improve the online learning/updating efficiency and the ANN model robustness, and hence open up new possibilities for realizing operational autonomy for mechanical systems with anomalies on platforms with limited computing resources.
ANN system identification/modeling based on the MLP is used to construct the offline baseline model. Its prediction accuracy is further improved by using the GA to select the optimal network structure and hyperparameters, including the time window sizes for the input and output delays, the hidden layer size, and the training algorithm. The optimized MLP is used to initialize the online ANN model as a copy, which along with the offline ANN model forms the aforementioned dual-net model and is updated online as necessary. The dual-net model is then combined with the MPC for online synthesis of the control actions to be applied to the physical plant. Under the dual-thread decision maker framework, new ANN updating and switch schemes for MPC are proposed. That is, when the ANN prediction error exceeds the prescribed threshold value, the system is triggered to accumulate operational data for a specified period of time, followed by online ANN training using the accumulated data, in which the structure of the online ANN remains unchanged and only the weights are updated. Both the offline and the online ANNs run in parallel throughout the simulation and are compared with the actual plant response, and the one exhibiting better prediction accuracy is selected for MPC prediction in the next horizon. Finally, a case study of the unmanned quadrotor model is undertaken to verify the proposed methodology through numerical simulation. The dual-thread decision maker and the dual-net model demonstrate good performance in both the accuracy of predicting the actual plant response and the quality of system control subject to a growing anomaly. In summary, the updated ANN-based MPC outperforms that based solely on the offline ANN in the presence of the anomaly, as manifested quantitatively by 4× and 2× reductions in the control and the prediction error, respectively. The results verify the feasibility of compensating for the degraded performance caused by shifts in system dynamics.
Despite these results, the current method for ANN updating has several limitations. The authors implemented several techniques to address the overfitting issue. However, a few error spikes still remain in the predicted results. This will be more critical when the online data are even less diverse. Therefore, switching the model used for MPC synthesis serves as an additional safeguard to ensure desirable MPC performance and stable operation. The root of the overfitting issue is attributed to the large number of fitting parameters (about 1500) during online training. Nonetheless, most anomaly scenarios (e.g., loose fitting, wear, fatigue, and others) occur gradually and cause incremental variations in system dynamics. Therefore, updating all of the network weights each time may be unnecessary in terms of both resource usage and model quality. Future research will focus on further investigating and mitigating these issues.
This research was sponsored by the DoD/Army Research Laboratory (ARL) under the Contract Number W911QX-18-P-0180. The authors would like to acknowledge Mr. Eric Mark at the ARL for his support and feedback on the present work.
Compliance with ethical standards
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
- 15. Sena HJ, Ramos VS, Silva FV, Fileti AMF (2017) Adaptive offset remover based on Kalman filter integrated to a model predictive controller. Chem Eng 57:1093–1098
- 19. Lam HK, Ling SH, Leung FH, Tam PKS (2001) Tuning of the structure and parameters of neural network using an improved genetic algorithm. In: The 27th annual conference of the IEEE Industrial Electronics Society, 2001. IECON’01, vol 1, pp 25–30. IEEE
- 23. Puttige VR, Anavatti SG (2008) Real-time system identification of unmanned aerial vehicles: a multi-network approach. JCP 3(7):31–38
- 30. Iverson D (2008) Data mining applications for space mission operations system health monitoring. In: SpaceOps 2008 Conference, p 3212
- 32. Lavin A, Ahmad S (2015) Evaluating real-time anomaly detection algorithms–the numenta anomaly benchmark. In: 2015 IEEE 14th international conference on machine learning and applications (ICMLA), pp 38–44. IEEE
- 34. Hundman K, Constantinou V, Laporte C, Colwell I, Soderstrom T (2018) Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding. arXiv:1802.04431
- 36. Jiajin L, Rui L, Yingjing S, Jianxiao Z (2017) Design of attitude controller using explicit model predictive control for an unmanned quadrotor helicopter. In: 2017 Chinese Automation Congress (CAC), pp 2853–2857. IEEE
- 37. Kuyumcu A, Bayezit I (2017) Augmented model predictive control of unmanned quadrotor vehicle. In: 2017 11th Asian control conference (ASCC), pp 1626–1631. IEEE
- 38. Cheng H, Yang Y (2017) Model predictive control and PID for path following of an unmanned quadrotor helicopter. In: 2017 12th IEEE conference on industrial electronics and applications (ICIEA), pp 768–773. IEEE
- 39. ElKholy HM (2014) Dynamic modeling and control of a quadrotor using linear and nonlinear approaches. American University in Cairo