Hybrid deep learning diagonal recurrent neural network controller for nonlinear systems

In the present paper, a hybrid deep learning diagonal recurrent neural network controller (HDL-DRNNC) is proposed for nonlinear systems. The proposed HDL-DRNNC structure consists of a diagonal recurrent neural network (DRNN) whose initial weight values are obtained through deep learning (DL). The DL algorithm used in this study is a hybrid algorithm based on the Kohonen self-organizing map and the restricted Boltzmann machine. The weight update laws of the DRNN are developed using the Lyapunov stability criterion. Simulation tasks involving disturbance signals and parameter variations are performed on a mathematical system and a physical system to evaluate the performance and the robustness of the proposed controller. The results show that the proposed controller performs better than other existing controllers.


Introduction
Nonlinear systems, defined as systems whose behavior is not proportional to their inputs, play an important role in engineering research because many real-time implementations involve nonlinearity [1]. These systems suffer from inherent uncertainty, time-varying parameters and nonlinear dynamic behavior. Non-optimal control suffers from limitations caused by the assumptions made about the controlled system, such as linearity and time-invariance. Therefore, non-optimal control methods are not suitable for controlling nonlinear systems in practical applications [2,3]. These problems can be overcome by optimal control techniques. Artificial intelligence (AI)-based controllers are one of these techniques, and they have many advantages [4,5]: (1) they can lead to better performance when properly tuned; (2) they require less tuning effort than non-optimal controllers; (3) they can be designed based on data from the real system or plant if expert knowledge is not available; and (4) they can be designed using a combination of linguistic and response-based information [6][7][8][9]. Neural networks (NNs) are one class of AI methods; they are a series of algorithms that recognize underlying relationships in a set of data through a process that resembles the operation of the human brain [10].
There are various NN structures, such as recurrent neural networks (RNNs) [11][12][13][14] and multilayer feed-forward neural networks (MLFFNNs) [15,16]. An MLFFNN is called a static network because it has no tapped delay lines. Several MLFFNN models were created from observational data for predicting groundwater levels [17]. An NN controller based on the MLFFNN was designed as a direct adaptive inverse controller to control and estimate the model of nonlinear plants [18]. In [19], the researchers designed an MLFFNN for the classification of nonlinear mappings based on input and output samples.
In contrast, an RNN contains tapped delay lines and is therefore called a dynamic network. The RNN is more robust than the MLFFNN because it combines the MLFFNN framework with tapped delay lines [20]. Various RNN structures exist in this regard. In Elman NNs, the feedback connections from the output to the input of the hidden layer are made via a context layer. In Jordan NNs, the feedback connections from the output to the input layer are made via a state layer [21][22][23]. Recently, the fully connected RNN (FCRNN) was modified to provide a diagonal RNN (DRNN). In the hidden layer, the DRNN contains self-recurrent neurons that feed their output back only into themselves, not to other neurons in the same layer [24][25][26]. In [27], researchers designed an RNN-based controller to achieve high performance of a shunt active power filter. A self-organizing RNN was designed for a nonlinear model predictive controller to predict the behavior of nonlinear systems [28]. In [29], a flexible manipulator was designed with a DRNN controller to limit backward vibration, based on a shaking control signal generator and an online identification system. The DRNN was introduced as a controller and an observer for estimating the unknown dynamics of a nonlinear system [30]. A DRNN was developed to determine the optimal parameters of the PID controller for controlling induction motors [31]. In [32], based on the control inputs and the current quadrotor states, researchers developed a PID controller using virtual sensing based on DRNNs and Kalman filters to predict the unmeasurable states of the quadrotor system.
Machine learning (ML) is an application of AI that can automatically learn from experience without explicit programming. ML focuses on the development of programs that can access data and use it to learn for themselves [33]. In this context, deep learning (DL) is part of a wider family of ML methods based on artificial neural networks (ANNs) that learn representations in a supervised, unsupervised, or semi-supervised manner [34]. DL has various applications, including the following: (1) in automation systems, an approach for detecting and assessing food waste trays based on a hierarchical DL algorithm was presented [35]; (2) in medicine, a DL algorithm was used to classify and predict mutations from non-small cell lung cancer histopathology images [36]; (3) in agriculture, a DL algorithm was introduced to locate paddy fields at the pixel level over a whole year and for each temporal instance, based on real imagery datasets of different landscapes from 2016 to 2018 [37]; and (4) in recognition, a DL algorithm was used for real-time modeling of human activity recognition with smartphones [38]. According to the definitions in [39,40], the DL of NNs includes two steps: the first is unsupervised training, and the second is using the weights from the unsupervised training to initialize the multilayer NN. This is considered the main advantage of DL because weight initialization is a critical issue.

Literature review
In [41], the parameters of the classical PID controller were tuned based on DL for controlling a maglev train, which is a new type of ground transport. Deep NNs (DNNs) were introduced for modeling dynamical systems with complex behavior [42]; three DNN structures were trained on successive data to study the validity of these networks in modeling dynamical systems. In [43], DL was designed for analyzing the performance of a nonlinear continuous stirred tank reactor, with its weights tuned by a hybrid algorithm. DL was combined with a fuzzy system in a hybrid algorithm for tuning the parameters of the PID controller [44], which was used for controlling the speed of a brushless DC motor. In [45], a DL controller based on the MLFFNN and the RBM was introduced; the RBM is used to initialize the weight values of the network for nonlinear systems. In [46], DL was introduced for modeling nonlinear systems based on the Elman RNN and the restricted Boltzmann machine (ERNN-RBM), which is an unsupervised method that initializes only the first layer.

Motivation
It is evident from the literature review that DL has been widely applied to system modeling, whereas its application to control remains comparatively limited. Since nonlinear systems suffer from external disturbances and uncertainties, the main purpose of the present paper is to shed further light on the design of stable controllers that overcome these problems. The self-organizing map (SOM) is an unsupervised learning algorithm that produces a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map. It differs from other ANNs in that it relies on competitive learning rather than error-correction learning (such as backpropagation with gradient descent). It uses a neighborhood function to preserve the topological properties of the input space, reduces data by creating a spatially organized representation, and helps to discover correlations in the data [47][48][49][50]. On the other hand, the RBM is an unsupervised learning algorithm that makes inferences from input data without labeled responses. Controllers and models based on NNs depend strongly on their initial weights. If the initial weights are not appropriate, the network becomes stuck in a local minimum, the training process ends at a poor solution, and the network becomes infeasible to train. The RBM is used to overcome this problem [46,51]. Hence, a combination of the features of the SOM and the RBM is proposed to improve the learning performance of the network. To control nonlinear systems, the hybrid deep learning DRNN controller (HDL-DRNNC) based on the SOM and the RBM is proposed. The initial weights of the DRNN are obtained using a hybrid deep learning (HDL) procedure, which is an unsupervised learning procedure based on the RBM and the SOM of the Kohonen procedure.
To ensure the stability of the parameter adaptation laws, the Lyapunov procedure is applied. The proposed HDL-DRNNC is trained quickly to keep the trajectory and to overcome system parameter variations and external disturbances. As shown in the simulation results, these features make the proposed HDL-DRNNC more robust than other controllers under the same conditions.

Novelties and contributions
The main contributions of the paper are summarized as follows:
• A new HDL procedure for the DRNN controller is proposed for nonlinear systems.
• The weight update law for the DRNN of the proposed algorithm is developed using Lyapunov theory to achieve stability.
• Compared to other existing controllers, the HDL-DRNNC can handle system uncertainties in both a mathematical system and a physical system.
The organization of the paper is as follows: the DRNN structure is presented in Sect. 2. The proposed HDL-DRNNC is introduced in Sect. 3. The weight updating based on the Lyapunov stability criterion is introduced in Sect. 4. The HDL-DRNNC pseudocode is explained in Sect. 5. The simulation results for the mathematical and physical nonlinear systems are introduced in Sect. 6. Finally, Sect. 7 presents the conclusion, which is followed by the references.

Diagonal recurrent neural network structure
As shown in Fig. 1, the DRNN consists of four layers: an input layer, two hidden layers, and an output layer.
Input layer: the external input vector is represented by $E(j) = [e_{x1}(j) \cdots e_{xn}(j)]^T$, and $\Re^{(1)}(j)$ is the input weight matrix, which links the input vector to the neurons of hidden layer (1). Generally, $\Re^{(1)}_{Jn}(j)$ denotes the input weight between input neuron $n$ and hidden layer (1) neuron $J$.
Hidden layer (1): the output of each node is denoted by $v^{(1)}_j(j)$, where the subscript indexes the node and the argument is the iteration. In its defining equation, the self-feedback of each node is weighted by the diagonal weight vector of hidden layer (1), $J$ is the number of nodes, $T_j(j)$ denotes the threshold value of every node, and $f(\cdot)$ denotes the hyperbolic tangent function, $f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, whose derivative is $f'(x) = 1 - f^2(x)$.
Hidden layer (2): the output of each node is denoted by $v^{(2)}_m(j)$. In its defining equation, the self-feedback of each node is weighted by the diagonal weight vector of hidden layer (2), $M$ denotes the number of nodes, $T_m(j)$ denotes the threshold value of each node, and $\Re^{(2)}(j)$ denotes the weight matrix between hidden layer (1) and hidden layer (2). Generally, $\Re^{(2)}_{MJ}(j)$ denotes the weight between hidden layer (1) neuron $J$ and hidden layer (2) neuron $M$.
Output layer: its output is denoted by $u(j)$. In its defining equation, $T(j)$ denotes the threshold value and $\Re^{(3)}(j)$ denotes the weights between hidden layer (2) and the output layer.
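The layer-by-layer description above can be illustrated with a minimal forward-pass sketch. It assumes the commonly used DRNN form, in which each hidden node applies the hyperbolic tangent to the sum of its weighted inputs, its own previous output scaled by a diagonal weight, and its threshold; the exact equations of the paper are not reproduced here. The class name, attribute names, and the small random initial values are illustrative assumptions only.

```python
import numpy as np


class DiagonalRNN:
    """Sketch of a DRNN with two self-recurrent hidden layers (assumed form)."""

    def __init__(self, n_in=3, n_h1=10, n_h2=10, seed=0):
        rng = np.random.default_rng(seed)
        # Feed-forward weights (hypothetical small random values).
        self.W1 = rng.normal(scale=0.1, size=(n_h1, n_in))   # input -> hidden (1)
        self.W2 = rng.normal(scale=0.1, size=(n_h2, n_h1))   # hidden (1) -> hidden (2)
        self.W3 = rng.normal(scale=0.1, size=n_h2)           # hidden (2) -> output
        # Diagonal weights: one self-feedback weight per hidden node.
        self.d1 = np.zeros(n_h1)
        self.d2 = np.zeros(n_h2)
        # Thresholds of the hidden nodes and of the output node.
        self.t1 = np.zeros(n_h1)
        self.t2 = np.zeros(n_h2)
        self.t3 = 0.0
        # Previous hidden outputs, needed for the self-recurrence.
        self.v1_prev = np.zeros(n_h1)
        self.v2_prev = np.zeros(n_h2)

    def step(self, e):
        """Advance one iteration for input vector e and return the output u."""
        # Each node feeds back only its own previous output (element-wise product).
        v1 = np.tanh(self.W1 @ e + self.d1 * self.v1_prev + self.t1)
        v2 = np.tanh(self.W2 @ v1 + self.d2 * self.v2_prev + self.t2)
        u = float(self.W3 @ v2 + self.t3)
        self.v1_prev, self.v2_prev = v1, v2
        return u


net = DiagonalRNN()
u = net.step(np.array([1.0, 0.5, 0.25]))
```

The element-wise product with `d1` and `d2` is what distinguishes the diagonal structure from a fully connected RNN, which would use a full recurrent matrix instead.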

Hybrid deep learning diagonal recurrent neural network controller
Weight initialization is a critical issue for DRNN controllers. If the initial weights are zero or otherwise unsuitable, the DRNN can become stuck in a local minimum, and training terminates at a poor solution because learning in the initial layers of the network becomes impossible [46]. This issue may cause the controller to become unstable. For this reason, DL is proposed. In this regard, any NN with more than one hidden layer is referred to as a deep network, which can be trained by DL. The proposed HDL-DRNNC consists of a DRNN that is trained by DL. The DL algorithm based on the SOM and the RBM is an unsupervised learning procedure for initializing the weight values of the DRNN. The two hidden layers of the DRNN, which are described in the previous section, are trained using the SOM algorithm, whereas the RBM algorithm is used for training the DRNN output layer. The proposed HDL-DRNNC structure is shown in Fig. 2.

SOM of the Kohonen learning procedure
The main purpose of this section is to initialize the weight values of the hidden layers of the DRNN introduced in Sect. 2. The NN used to initialize these weights is shown in Fig. 3. Its weights are trained based on the SOM of the Kohonen process [52]. The training rests on the hypothesis that one neuron of the layer responds most strongly to the input; this neuron is called the winner neuron. For training hidden layer (1) of the DRNN, the number of neurons in the input layer of the NN (Fig. 3) is equal to the number of input neurons of hidden layer (1) of the DRNN. Here, $f$ denotes the number of neurons in the input layer of the NN (Fig. 3) and $i$ denotes the number of input neurons of hidden layer (1). The training proceeds as follows:
Step 1: All the weights $\varpi^{(Q)}_{lf}$, $Q = 1, 2$, of the NN shown in Fig. 3 are initialized to zero.
Step 2: Enter the values of $E(j)$ to the NN.
Step 3: The winner neuron $x$ is selected using the Euclidean distance between the input and the neuron weights $\varpi^{(Q)}_{x}$, where $Q$, $l$ and $x$ are the index of the NN layer, the index of any neuron and the index of the winner neuron, respectively.
Step 4: Calculate the Gaussian neighborhood function, in which $\varphi$ and $\zeta$ are constants.
Step 5: Update the weights of the NN, $\varpi^{(Q)}_{lg}(j)$, where $Q = 2$, according to the updating law.
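The steps above can be sketched as one training iteration of a standard Kohonen SOM. This is an illustration under stated assumptions, not the exact updating law of the paper: the neurons are assumed to lie on a one-dimensional grid, the neighborhood is a Gaussian of the grid distance to the winner, and the names `lr` (learning rate) and `sigma` (neighborhood width) are hypothetical stand-ins for the constants of the neighborhood function.

```python
import numpy as np


def som_step(weights, x, lr=0.5, sigma=1.0):
    """One Kohonen update.

    weights: array of shape (L, F), one weight vector per map neuron.
    x:       input vector of shape (F,).
    Returns the updated weights and the index of the winner neuron.
    """
    # Winner selection: smallest Euclidean distance to the input.
    dists = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(dists))
    # Gaussian neighborhood over the grid distance to the winner.
    idx = np.arange(weights.shape[0])
    h = np.exp(-((idx - winner) ** 2) / (2.0 * sigma ** 2))
    # Move every neuron toward the input, scaled by its neighborhood value.
    new_weights = weights + lr * h[:, None] * (x - weights)
    return new_weights, winner


w0 = np.zeros((4, 3))                      # zero initial weights, as in Step 1
w1, win = som_step(w0, np.array([1.0, 0.0, 0.0]))
```

With all-zero initial weights every neuron is equally distant from the first input, so `np.argmin` resolves the tie by selecting the first neuron; any other tie-breaking rule would serve equally well in this sketch.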

Restricted Boltzmann machine
Initializing the weight values for the output layer of the DRNN is performed based on the RBM [45,53]. The RBM used in this section is illustrated in Fig. 4, where all the weights of the output layer are initially equal to zero. The RBM contains two main layers: first, the visible layer, which contains a group of visible nodes $S$, and second, the hidden layer, which contains a group of hidden nodes $D$ [46,54]. In this paper, $P = 1$. Based on the approach in [53,55], Hinton introduced contrastive divergence (CD) for training the RBM. The RBM input is $S(r-1)$, which is applied to the visible layer at time $(r-1)$. The hidden layer output is then obtained from the visible states, where $X_{ji}$ represents the weight between a visible node $i$ and a hidden node $j$, and $S_i$ represents the binary state of the visible node. $P$ and $N$ are the number of hidden nodes and the number of visible nodes, respectively. $B = [B_1 \cdots B_P]^T$ represents the hidden node biases and $F$ denotes the sigmoid activation function $F(z) = 1/(1 + \exp(-z))$. The inverse layer reconstructs the data from the hidden layer. As a result, $S(r)$ is obtained at time $r$, where $X_{ij}$ represents the weight between a hidden node $j$ and a visible node $i$, $D_j$ represents the binary state of a hidden node and $A = [A_1 \cdots A_N]^T$ represents the visible node biases. For the CD-$\Re$ case, the learning rules for the weights and biases of the nonlinear RBM are given in [55], where $e$ is the RBM learning rate and $j$ is the iteration number. Once the parameters of the RBM are learned, the output layer of the DRNN can be initialized based on the RBM weights $X_{ji}(j+1)$.
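As an illustration of the training scheme described above, the following sketch performs one contrastive divergence update with a single reconstruction step (CD-1) in its widely used textbook form. It is not claimed to be the exact rule of [55]; the function name `cd1_update`, the argument names, and the default learning rate are assumptions, and the weight matrix is stored with shape (P, N) so that rows correspond to hidden nodes.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cd1_update(W, A, B, v0, lr=0.1, rng=None):
    """One CD-1 step for an RBM.

    W: weights, shape (P, N);  A: visible biases, shape (N,);
    B: hidden biases, shape (P,);  v0: visible input, shape (N,).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Positive phase: hidden probabilities and a sampled binary hidden state.
    h0 = sigmoid(W @ v0 + B)
    h0_sample = (rng.random(h0.shape) < h0).astype(float)
    # Reconstruction of the visible layer from the hidden state.
    v1 = sigmoid(W.T @ h0_sample + A)
    # Negative phase: hidden probabilities for the reconstruction.
    h1 = sigmoid(W @ v1 + B)
    # Parameter updates: data statistics minus reconstruction statistics.
    W_new = W + lr * (np.outer(h0, v0) - np.outer(h1, v1))
    A_new = A + lr * (v0 - v1)
    B_new = B + lr * (h0 - h1)
    return W_new, A_new, B_new


# N = 10 visible nodes and P = 1 hidden node, matching the sizes used in the paper.
W, A, B = np.zeros((1, 10)), np.zeros(10), np.zeros(1)
W, A, B = cd1_update(W, A, B, np.ones(10))
```

After training, a row of `W` has one entry per visible node, which is the shape needed to seed a single-output layer fed by ten hidden-layer outputs.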

Weights updating based on Lyapunov stability
The performance function is denoted by $E_l(j)$ and is defined in terms of $y_d(j)$ and $y_a(j)$, which denote the reference input and the actual output, respectively. The DRNN is trained to minimize the error signal [56]. To achieve stability, the weight update laws of the DRNN of the proposed HDL-DRNNC are developed using the Lyapunov stability criterion. Two conditions must be met for the system to be asymptotically stable, as outlined in Eqs. (21) and (22), where $R_x(j)$ is a positive definite function. The weight update can be expressed in the common form of Eq. (23), where $U_l(j)$ and $\Delta U_l(j)$ denote a generalized weight vector and its desired modification, respectively, and the learning rate is denoted by $g$.

Theorem 1
To achieve the stability of the controlled process, the updating equation for the DRNN weights of the proposed scheme is obtained as in Eq. (24), where $b$, $r$ and $k$ are positive constants.
Proof Suppose a Lyapunov function $R_x(j)$. The term $\frac{r}{2}\left(e_x(j+1)\,U_l(j+1)\right)^2$ can be expressed in linear form based on a Taylor series [24], where HOT denotes the higher-order terms, which can be ignored. Therefore, Eq. (29) can be rewritten, and the right side of the resulting equation is rewritten in turn. Similarly, Eq. (32) can be rewritten. Then, the term $\frac{\partial e_x(j)}{\partial U_l(j)}\,\Delta U(j)$ is replaced in Eq. (31), and the remaining terms are treated similarly. The second condition for stability is then determined, and Eq. (35) can be rewritten with $n \ge 0$. To guarantee the condition $\Delta R_x(j) \le 0$, Eq. (37) is obtained. The general quadratic equation is given in Eq. (38), and its roots are then calculated. From Eqs. (37) and (38), $\Delta U_l(j)$ acts as $v$ in Eq. (38), and the values of $c$, $b$ and $a$ in Eq. (37) are obtained accordingly. There is a single unique solution, and $n$ can be calculated from it, subject to $n \ge 0$. So, the unique root of Eq. (37) is $v_{1,2} = -\frac{b}{2c}$. Similarly, Eq. (44) can be reformulated. So, by replacing $\Delta U_l(j)$ in Eq. (23), the updating equation for the parameters of the DRNN of the HDL-DRNNC can be given as in Eq. (24).
Fig. 15 Output response for the EVS (Task 1)
Fig. 20 The EVS control signal (Task 3)
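The root-finding step of the proof can be checked numerically. The sketch below assumes that the general quadratic is written as $c v^2 + b v + a = 0$, with $c$ as the leading coefficient, which is what the stated root $-b/(2c)$ suggests; the function name and argument order are illustrative only. A single (repeated) root arises exactly when the discriminant $b^2 - 4ca$ is zero.

```python
import math


def quadratic_roots(c, b, a):
    """Real roots of c*v**2 + b*v + a = 0, with c the leading coefficient."""
    if c == 0:
        raise ValueError("leading coefficient c must be nonzero")
    disc = b * b - 4.0 * c * a
    if disc < 0:
        return ()                         # no real root
    if disc == 0:
        return (-b / (2.0 * c),)          # unique (repeated) root
    s = math.sqrt(disc)
    return ((-b - s) / (2.0 * c), (-b + s) / (2.0 * c))


# Zero discriminant: the only root is -b / (2c).
roots = quadratic_roots(1.0, -4.0, 4.0)
```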

Steps of the proposed HDL-DRNNC
The system block diagram with the proposed HDL-DRNNC is shown in Fig. 5. The error signal $e_x(j)$ is the difference between the reference input $y_d(j)$ and the actual output of the nonlinear system $y_a(j)$. The controller input is $e_x(j)$ and its output is the control signal $u(j)$, which is forwarded to the nonlinear system. As shown in Figs. 1 and 2, the first layer of the HDL-DRNNC has three inputs: the error signal $e_{x1}(j) = e_x(j)$, the change of error $e_{x2}(j) = e_x(j) - e_x(j-1)$, and the change of the change of error $e_{x3}(j) = e_x(j) - 2e_x(j-1) + e_x(j-2)$. The output layer has one output, $u(j)$. Algorithm 1 summarizes the proposed HDL-DRNNC pseudocode for the reader's convenience.
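The three controller inputs defined above are the error and its first and second backward differences. A minimal sketch of their computation follows; the function name is illustrative, and the past errors before the first sample are assumed to be zero, which the paper does not state.

```python
def error_features(reference, output):
    """Return (e, delta_e, delta2_e) for each step, assuming e(-1) = e(-2) = 0."""
    feats = []
    e_prev1 = e_prev2 = 0.0
    for y_ref, y_act in zip(reference, output):
        e = y_ref - y_act                          # error signal
        d_e = e - e_prev1                          # change of error
        d2_e = e - 2.0 * e_prev1 + e_prev2         # change of the change of error
        feats.append((e, d_e, d2_e))
        e_prev2, e_prev1 = e_prev1, e
    return feats


features = error_features([1.0, 3.0, 6.0], [0.0, 0.0, 0.0])
```

Each tuple can be passed to the controller as its three-element input vector at the corresponding iteration.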

Simulation results
A comparison of the proposed HDL-DRNNC and DRNNC is performed under the same conditions with zero initial weights to show the performance of the hybrid learning algorithm.
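In the present paper, the parameters are assigned as $J = M = l = 10$, $n = 3$, $R = 10$, $N = 10$, $\Re = 1$ and $P = 1$. In order to evaluate the performance and demonstrate the robustness of the proposed controller, the mean absolute error (MAE) and the root-mean-square error (RMSE) are used. MAE and RMSE are defined as [57,58]: $\mathrm{MAE} = \frac{1}{K_L}\sum_{j=1}^{K_L} \left| e_x(j) \right|$ and $\mathrm{RMSE} = \sqrt{\frac{1}{K_L}\sum_{j=1}^{K_L} e_x^2(j)}$, where $K_L$ denotes the number of iterations.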
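Both indices can be computed directly from the recorded error sequence. The sketch below uses the standard definitions of MAE and RMSE; the function names are illustrative, and the error list is a made-up example rather than data from the paper.

```python
import math


def mae(errors):
    """Mean absolute error over the recorded error samples."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root-mean-square error over the recorded error samples."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


sample_errors = [3.0, -4.0]        # hypothetical tracking errors
print(mae(sample_errors), rmse(sample_errors))
```

RMSE weights large deviations more heavily than MAE, so reporting both separates a controller with occasional large excursions from one with a small persistent offset.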

Task 1: tracking the reference signal trajectory
The reference signal in this task is defined in terms of the sampling period $T$. Figures 6 and 7 show the system response and the control signal for the proposed HDL-DRNNC and the DRNNC. With the DRNNC, there is a clear error between the set-point and the system output at the beginning of the simulation task. In contrast, the proposed controller, which uses the hybrid learning algorithm based on the SOM and the RBM, tracks the set-point without a steady-state error.

Task 2: uncertainty due to disturbance
To evaluate the robustness of the proposed HDL-DRNNC, a disturbance equal to 50% of the desired output is added to the system output at $j = 2950$. Figure 8 shows that, with the proposed HDL-DRNNC, the system output tracks the set-point without a steady-state error after the disturbance is added to the measured output. With the DRNNC, a steady-state error remains after the disturbance is added. Figure 9 shows the control signal of the system for both controllers. In contrast with the DRNNC, the proposed HDL-DRNNC clearly responds to the disturbance effects.

Task 3: Uncertainty due to disturbance with parameters variation
At iteration $j = 2950$, after the system output reaches the reference input, the system parameters are changed to $a_1(j) = 0.35$, $a_2(j) = -0.35$, $a_3(j) = -1.5$, $a_4(j) = 2.5$ and $a_5(j) = 1$, together with a 40% disturbance. The output response and the control signal of the system are shown in Figs. 10 and 11, respectively, for both controllers. It is evident that the proposed HDL-DRNNC responds to the uncertainty effects (the parameter variation and the disturbance) better than the DRNNC.

Task 4: uncertainty due to noise
A random noise signal is added at iteration $j = 2950$. The output response and the control signal of the system are shown in Figs. 12 and 13, respectively. In contrast to the output response based on the DRNNC, the proposed HDL-DRNNC recovers from the impact of random noise more quickly. The proposed HDL-DRNNC is therefore more robust than the DRNNC. Tables 1 and 2 list the values of MAE and RMSE for the proposed HDL-DRNNC, the DRNNC and other previously published schemes, such as the feed-forward neural network based on RBM (FFNN-RBM) [45], the ERNN-RBM [46], the feed-forward neural network with hybrid learning controller (FFNNHLC) [61], the FCRNNC [62] and the adaptive interval type-2 Takagi-Sugeno-Kang fuzzy logic controller based on reinforcement learning (AIT2-TSK-FLC-RL) [63]. In addition, the proposed HDL-DRNNC is compared with the DRNNC based on SOM (DRNNC-SOM) to show the benefits of hybrid learning.
The MAE and RMSE values for the proposed HDL-DRNNC scheme are clearly smaller than those obtained for the other schemes. Because DL is used to initialize the weight values of the proposed HDL-DRNNC, it reduces the impact of the system uncertainties caused by external disturbance, parameter variations, and random noise compared with the other schemes. The hybrid algorithm is used because it gave better results than the DRNNC based on the SOM algorithm alone, as shown in Tables 1 and 2.

Case 2: physical system
In this section, the proposed controller is used for controlling a physical system, namely the electrical vehicle system (EVS). EVSs are advancing rapidly because of the importance of environmental protection and the shortage of energy sources [64]. The control of an EVS plays an important role in achieving a high-performance EVS with an optimal balance of travelling range per charge, maximum speed and acceleration performance [64]. EVSs are basically time-variant nonlinear systems (e.g. the EVS parameters and the road condition vary continuously), which makes the control of an EVS quite cumbersome [64]. Therefore, the control of an EVS should be designed to be robust and adaptive in order to improve both the dynamic and the steady-state performance of the system. Figure 14 shows the schematic diagram of an EVS, and the mathematical model is given in [63][64][65][66], where $\omega$ and $i$ denote the angular speed and the current of the motor. $C_f$, $C_a$, $\Re_f$ and $\Re_a$ denote the field inductance, the armature inductance, the field resistance and the armature resistance, respectively. $J$ denotes the inertia of the motor, $r_e$ denotes the tire radius of the EVS, which includes the tires with the gearing system, $C_{af}$ is the mutual inductance between the armature and the field windings, and $\rho$, $A$ and $m$ denote the air density, the frontal area of the vehicle and the mass of the EVS, respectively. $\mu_{rr}$, $B$, $G$ and $C_d$ denote the rolling resistance coefficient, the viscous coefficient, the gearing ratio and the drag coefficient, respectively. The values of the EVS parameters are listed in Table 3.
In this task, a set-point change is carried out to test the proposed HDL-DRNNC, which is compared with the DRNNC. It is clear that the EVS response using the proposed HDL-DRNNC reaches the set-point faster than with the DRNNC.

Task 2: uncertainty due to parameters variation with disturbance
In this task, the EVS parameters are varied as in Table 4, together with a 40% disturbance, after the system output reaches the reference input at $t = 240$ s. The system response and its control signal for both controllers are shown in Figs. 17 and 18. It is clear that the proposed HDL-DRNNC is more robust than the DRNNC because of its ability to reduce the effect of system uncertainties.

Task 3: uncertainty due to random noise
A random noise signal is added at $t = 240$ s. The EVS response and its control signal are shown in Figs. 19 and 20. It is clear that the EVS response for the proposed HDL-DRNNC recovers from the impact of random noise more quickly than the output response based on the DRNNC. The proposed HDL-DRNNC is therefore more robust than the DRNNC. The MAE and RMSE values for the proposed HDL-DRNNC, the DRNNC and other schemes are presented in Tables 5 and 6. It is clear that the MAE and RMSE values for the proposed HDL-DRNNC are smaller than those obtained for the other schemes. Compared with the other schemes, the HDL-DRNNC is better able to reduce the impact of system uncertainties.
The main features of the proposed HDL-DRNNC are summarized as follows: (1) it learns quickly because it uses hybrid DL, which uses the SOM and the RBM to initialize the weight values; (2) the controller is stable because its weight values are updated using the Lyapunov stability method; and (3) it succeeds in reducing the effect of system uncertainties and in tracking the desired output for both the mathematical system and the physical system.

Conclusion
In the present paper, the HDL-DRNNC is proposed for nonlinear systems. The HDL-DRNNC uses a DRNN whose initial weights are learned by HDL. In order to guarantee the stability of the proposed controller, the weight update laws of the DRNN are derived using the Lyapunov stability criterion. Two nonlinear systems, one mathematical and one physical, are used to evaluate the performance of the proposed controller. According to the obtained results, the proposed HDL-DRNNC can overcome uncertainty and track the desired output of the controlled systems. The MAE and RMSE indicators show that the responses of the mathematical and physical systems under the HDL-DRNNC recover faster from the effects of uncertainties than the responses under the DRNNC and other existing controllers. In conclusion, the HDL-DRNNC is more robust and recovers from uncertainty faster than the DRNNC and the other controllers. In future work, the authors will try to implement the proposed algorithm in practice using microcontrollers for controlling a real system.

Declarations
Conflict of interest The authors declare that there is no conflict of interest regarding the publication of this manuscript.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.