Balanced simplicity–accuracy neural network model families for system identification


Abstract

Nonlinear system identification has provided highly accurate models over the last decades; however, the user remains interested in finding a good balance between high-accuracy models and moderate complexity. In this paper, four balanced accuracy–complexity identification model families are proposed. These models are derived by selecting different combinations of activation functions in a dedicated neural network design presented in our previous work (Romero-Ugalde et al. in Neurocomputing 101:170–180. doi:10.1016/j.neucom.2012.08.013, 2013). The neural network, based on a recurrent three-layer architecture, helps to reduce the number of parameters of the model after the training phase without any loss of estimation accuracy. Even if this reduction is achieved by a convenient choice of the activation functions and of the initial conditions of the synaptic weights, it nevertheless leads to a wide range of models among the most encountered in the literature. To validate the proposed approach, three different systems are identified: the first one corresponds to the well-known Wiener–Hammerstein system proposed as a benchmark in SYSID 2009; the second system is a flexible robot arm; and the third system corresponds to an acoustic duct.


Abbreviations

\(J_{u} \in R^{1\times n_b}\) :

Input regressor vector

\(J_{\hat{y}} \in R^{1\times n_a}\) :

Output regressor vector

\(n_a \in R^{1\times 1}\) :

Number of past outputs of the system

\(n_b \in R^{1\times 1}\) :

Number of past inputs of the system

\(X \in R^{1\times 1}\) :

Synaptic weight

\(Z_{b} \in R^{1\times 1}\) :

Synaptic weight

\(Z_{a} \in R^{1\times 1}\) :

Synaptic weight

\(V_{b_i} \in R^{1\times 1}\) :

Synaptic weight

\(V_{a_i} \in R^{1\times 1}\) :

Synaptic weight

\(Z_h \in R^{1\times 1}\) :

Synaptic weight

\(W_{b_{i}} \in R^{1\times n_b}\) :

Synaptic weight

\(W_{a_i} \in R^{1\times n_a}\) :

Synaptic weight

\(W_{B} \in R^{1\times n_b}\) :

Synaptic weight

\(W_{A} \in R^{1\times n_a}\) :

Synaptic weight

\(V_{B} \in R^{1\times 1}\) :

Synaptic weight

\(V_{A} \in R^{1\times 1}\) :

Synaptic weight

\(Z_H \in R^{1\times 1}\) :

Synaptic weight

\(X^* \in R^{1\times 1}\) :

Synaptic weight after training

\(Z_{b}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(Z_{a}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(V_{b_i}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(V_{a_i}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(Z_h^* \in R^{1\times 1}\) :

Synaptic weight after training

\(W_{b_{i}}^* \in R^{1\times n_b}\) :

Synaptic weight after training

\(W_{a_i}^* \in R^{1\times n_a}\) :

Synaptic weight after training

\(W_{B}^* \in R^{1\times n_b}\) :

Synaptic weight after training

\(W_{A}^* \in R^{1\times n_a}\) :

Synaptic weight after training

\(V_{B}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(V_{A}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(Z_H^* \in R^{1\times 1}\) :

Synaptic weight after training

\(\varphi _{1}\) :

Activation function (linear or nonlinear)

\(\varphi _{2}\) :

Activation function (linear or nonlinear)

\(\varphi _{3}\) :

Activation function (linear or nonlinear)

\(nn \in R^{1\times 1}\) :

Number of neurons

\(e_{{\mathrm{sim}}}\) :

Simulation error

\(\mu _t\) :

Mean value of the simulation error

\(s_t\) :

Standard deviation of the error

\(e_{{\mathrm{RMS}}t}\) :

Root mean square (RMS) of the error

\(u\) :

Input of the neural network

\(\hat{y}\) :

Output of the neural network

References

  1. Aadaleesan P, Miglan N, Sharma R, Saha P (2008) Nonlinear system identification using Wiener type Laguerre–Wavelet network model. Chem Eng Sci 63(15):3932–3941. doi:10.1016/j.ces.2008.04.043

  2. An SQ, Lu T, Ma Y (2010) Simple adaptive control for siso nonlinear systems using neural network based on genetic algorithm. In: Proceedings of the ninth international conference on machine learning and cybernetics IEEE, Qingdao, China

  3. Angelov P (2011) Fuzzily connected multimodel systems evolving autonomously from data streams. IEEE Trans Syst Man Cybern Part B Cybern 41(4):898–910. doi:10.1109/TSMCB.2010.2098866

  4. Bebis G, Georgiopoulos M (1994) Feed-forward neural networks: why network size is so important. IEEE Potentials 13(4):27–31

  5. Biao L, Qing-chun L, Zhen-hua J, Sheng-fang N (2009) System identification of locomotive diesel engines with autoregressive neural network. In: ICIEA, IEEE, Xi’an, China. doi:10.1109/ICIEA.2009.5138836

  6. Castañeda C, Loukianov A, Sanchez E, Castillo-Toledo B (2013) Real-time torque control using discrete-time recurrent high-order neural networks. Neural Comput Appl 22:1223–1232. doi:10.1007/s00521-012-0890-9

  7. Chen R (2011) Reducing network and computation complexities in neural based real-time scheduling scheme. Appl Math Comput 217(13):6379–6389. doi:10.1016/j.amc.2011.01.014

  8. Cichocki A, Unbehauen R (1993) Neural networks for optimization and signal processing, 1st edn. John Wiley and Sons Ltd, Baffins Lane, Chichester, West Sussex PO19 IUD, England

  9. Coelho L, Wicthoff M (2009) Nonlinear identification using a b-spline neural network and chaotic immune approaches. Mech Syst Signal Process 23(8):2418–2434. doi:10.1016/j.ymssp.2009.01.013

  10. Curteanu S, Cartwright H (2011) Neural networks applied in chemistry. I. Determination of the optimal topology of multilayer perceptron neural networks. J Chemom 25:527–549. doi:10.1002/cem.1401

  11. de Jesus Rubio J (2014) Fuzzy slopes model of nonlinear systems with sparse data. Soft Comput. doi:10.1007/s00500-014-1289-6

  12. de Jesus Rubio J (2014) Evolving intelligent algorithms for the modelling of brain and eye signals. Appl Soft Comput 14(part B):259–268. doi:10.1016/j.asoc.2013.07.023

  13. Endisch C, Stolze P, Endisch P, Hackl C, Kennel R (2009) Levenberg–Marquardt-based obs algorithm using adaptive pruning interval for system identification with dynamic neural networks. In: International conference on systems, man, and cybernetics, IEEE, San Antonio, Texas, USA

  14. Farivar F, Shoorehdeli MA, Teshnehlab M (2012) An interdisciplinary overview and intelligent control of human prosthetic eye movements system for the emotional support by a huggable pet-type robot from a biomechatronical viewpoint. J Frankl Inst 347(7):2243–2267. doi:10.1016/j.jfranklin.2011.04.014

  15. Ge H, Du W, Qian F, Liang Y (2009) Identification and control of nonlinear systems by a time-delay recurrent neural network. Neurocomputing 72:2857–2864. doi:10.1016/j.neucom.2008.06.030

  16. Ge H, Qian F, Liang Y, Du W, Wang L (2008) Identification and control of nonlinear systems by a dissimilation particle swarm optimization-based elman neural network. Nonlinear Anal Real World Appl 9(4):1345–1360. doi:10.1016/j.nonrwa.2007.03.008

  17. Goh CK, Teoh EJ, Tan KC (2008) Hybrid multiobjective evolutionary design for artificial neural networks. IEEE Trans Neural Netw 19(9):1531–1548

  18. Han X, Xie W, Fu Z, Luo W (2011) Nonlinear systems identification using dynamic multi-time scale neural networks. Neurocomputing 74(17):3428–3439

  19. Hangos K, Bokor J, Szederknyi G (2004) Analysis and control of nonlinear process systems. Springer, Berlin

  20. Hsu CF (2009) Adaptive recurrent neural network control using a structure adaptation algorithm. Neural Comput Appl 18:115–125. doi:10.1007/s00521-007-0164-0

  21. Isermann R, Munchhof M (2011) Identification of dynamic systems. An introduction with applications. Springer, Berlin

  22. de Jesus Rubio J, Pérez Cruz JH (2014) Evolving intelligent system for the modelling of nonlinear systems with dead-zone input. Appl Soft Comput 14(Part B):289–304. doi:10.1016/j.asoc.2013.03.018

  23. Khalaj G, Yoozbashizadeh H, Khodabandeh A, Nazari A (2013) Artificial neural network to predict the effect of heat treatments on vickers microhardness of low-carbon nb microalloyed steels. Neural Comput Appl 22(5):879–888. doi:10.1007/s00521-011-0779-z

  24. Leite D, Costa P, Gomide F (2013) Evolving granular neural networks from fuzzy data streams. Neural Netw 38:1–16. doi:10.1016/j.neunet.2012.10.006

  25. Lemos A, Caminhas W, Gomide F (2011) Multivariable gaussian evolving fuzzy modeling system. IEEE Trans Fuzzy Syst 19(1):91–104. doi:10.1109/TFUZZ.2010.2087381

  26. Ljung L (1999) System identification theory for the user. PTR Prentice Hall, Upper Saddle River, NJ 07458

  27. Loghmanian S, Jamaluddin H, Ahmad R, Yusof R, Khalid M (2012) Structure optimization of neural network for dynamic system modeling using multi-objective genetic algorithm. Neural Comput Appl 21(6):1281–1295. doi:10.1007/s00521-011-0560-3

  28. Lughofer E (2013) On-line assurance of interpretability criteria in evolving fuzzy systems. Achievements, new concepts and open issues. Inf Sci 251:22–46. doi:10.1016/j.ins.2013.07.002

  29. Majhi B, Panda G (2011) Robust identification of nonlinear complex systems using low complexity ANN and particle swarm optimization technique. Expert Syst Appl 38(1):321–333. doi:10.1016/j.eswa.2010.06.070

  30. Noorgard M, Ravn O, Poulsen NK, Hansen LK (2000) Neural networks for modelling and control of dynamic systems, 1st edn. Springer, Berlin

  31. Ordon̂ez FJ, Iglesias JA, de Toledo P, Ledezma A, Sanchis A (2013) Online activity recognition using evolving classifiers. Expert Syst Appl 40:1248–1255. doi:10.1016/j.eswa.2012.08.066

  32. Peralta-Donate J, Li X, Gutierrez-Sanchez G, Sanchis de Miguel A (2013) Time series forecasting by evolving artificial neural networks with genetic algorithms, differential evolution and estimation of distribution algorithm. Neural Comput Appl 22:11–20. doi:10.1007/s00521-011-0741-0

  33. Paduart J, Lauwers L, Pintelon R, Schoukens J (2009) Identification of a wiener-hammerstein system using the polynomial nonlinear state space approach. In: Proceedings of the 15th IFAC symposium on system identification, Saint-Malo, France, pp 1080–1085

  34. Petre E, Selisteanu D, Sendrescu D, Ionete C (2010) Neural networks-based adaptive control for a class of nonlinear bioprocesses. Neural Comput Appl 19:169–178. doi:10.1007/s00521-009-0284-9

  35. Pratama M, Anavatti SG, Angelov PP, Lughofer E (2014) PANFIS: a novel incremental learning machine. IEEE Trans Neural Netw Learn Syst 25(1):55–68. doi:10.1109/TNNLS.2013.2271933

  36. Romero-Ugalde HM, Carmona JC, Alvarado VM, Reyes-Reyes J (2013) Neural network design and model reduction approach for black box nonlinear system identification with reduced number of parameters. Neurocomputing 101:170–180. doi:10.1016/j.neucom.2012.08.013

  37. Sahnoun MA, Ugalde HMR, Carmona JC, Gomand J (2013) Maximum power point tracking using p&o control optimized by a neural network approach: a good compromise between accuracy and complexity. Energy Procedia 42:650–659. doi:10.1016/j.egypro.2013.11.067

  38. Sayah S, Hamouda A (2013) A hybrid differential evolution algorithm based on particle swarm optimization for nonconvex economic dispatch problems. Appl Soft Comput 13:1608–1619. doi:10.1016/j.asoc.2012.12.014

  39. Schoukens J, Suykens J, Ljung L (2009) Wiener-Hammerstein benchmark. In: Proceedings of the 15th IFAC symposium on system identification, Saint-Malo, France, pp 1086–1091

  40. Subudhi B, Jenab D (2011) A differential evolution based neural network approach to nonlinear system identification. Appl Soft Comput 11(1):861–871. doi:10.1016/j.asoc.2010.01.006

  41. Tzeng S (2010) Design of fuzzy wavelet neural networks using the GA approach for function approximation and system identification. Fuzzy Sets Syst 161(19):2585–2596. doi:10.1016/j.fss.2010.06.002

  42. Van Mulders A, Schoukens J, Volckaert M, Diehl M (2009) Two nonlinear optimization methods for black box identification compared. In: Proceedings of the 15th IFAC symposium on system identification, Saint-Malo, France, pp 1086–1091

  43. Wang X, Syrmos V (2007) Nonlinear system identification and fault detection using hierarchical clustering analysis and local linear models. In: Mediterranean conference on control and automation, Greece, Athens, pp 1–6

  44. Witters M, Swevers J (2010) Black-box model identification for a continuously variable, electro-hydraulic semi-active damper. Mech Syst Signal Process 24(1):4–18. doi:10.1016/j.ymssp.2009.03.013

  45. Xie W, Zhu Y, Zhao Z, Wong Y (2009) Nonlinear system identification using optimized dynamic neural network. Neurocomputing 72(13–15):3277–3287. doi:10.1016/j.neucom.2009.02.004

  46. Yan Z, Xiuxia L, Peng Y, Zengqiang C, Zhuzhi Y (2009) Modeling and control of nonlinear discrete-time systems based on compound neural networks. Chin J Chem Eng 17(3):454–459. doi:10.1016/S1004-9541(08

  47. Yu W (2006) Multiple recurrent neural networks for stable adaptive control. Neurocomputing 70(1–3):430–444. doi:10.1016/j.neucom.2005.12.122

  48. Yu W, Li X (2004) Fuzzy identification using fuzzy neural networks with stable learning algorithms. IEEE Trans Fuzzy Syst 12(3):411–420. doi:10.1109/TFUZZ.2004.825067

  49. Yu W, Morales A (2004) Gasoline blending system modeling via static and dynamic neural networks. Int J Model Simul 24(3):151–160

  50. Yu W, Rodriguez FO, Moreno-Armendariz MA (2008) Hierarchical fuzzy CMAC for nonlinear systems modeling. IEEE Trans Fuzzy Syst 16(5):1302–1314. doi:10.1109/TFUZZ.2008.926579

  51. Zhang H, Wu W, Yao M (2012) Boundedness and convergence of batch back-propagation algorithm with penalty for feedforward neural networks. Neurocomputing 89:141–146. doi:10.1016/j.neucom.2012.02.029

  52. Zhang J, Zhu Q, Wu X, Li Y (2013) A generalized indirect adaptive neural networks backstepping control procedure for a class of non-affine nonlinear systems with pure-feedback prototype. Neurocomputing 21(9):131–139. doi:10.1016/j.neucom.2013.04.015

  53. Zhang Z, Qiao J (2010) A node pruning algorithm for feedforward neural network based on neural complexity. In: International conference on intelligent control and information processing. IEEE, Dalian, China, pp 406–410

  54. Zhao H, Zeng X, He Z (2011) Low-complexity nonlinear adaptive filter based on a pipelined bilinear recurrent neural network. IEEE Trans Neural Netw 22(9):1494–1507

Author information

Correspondence to Hector M. Romero Ugalde.

Appendices

Appendix 1: Training

The adaptation laws of the synaptic weights, derived according to the steepest-descent gradient method, are as follows:

1.1 FF1 adaptation algorithms

$$\begin{aligned} X(k+1)=X(k)+\eta e(k) T \end{aligned}$$
(18)
$$\begin{aligned} Z_b(k+1)=Z_b(k)+\eta e(k) X r_b \end{aligned}$$
(19)
$$\begin{aligned} Z_a(k+1)=Z_a(k)+ \eta e(k) X r_a \end{aligned}$$
(20)
$$\begin{aligned} Z_h(k+1)=Z_h(k)+ \eta e(k) X \end{aligned}$$
(21)
$$\begin{aligned} V_{b_{i}}(k+1)=V_{b_{i}}(k)+\eta e(k) X Z_b \tanh (J_{u}W_{b_{i}}) \end{aligned}$$
(22)
$$\begin{aligned} V_{a_{i}}(k+1)=V_{a_{i}}(k)+\eta e(k) X Z_a \tanh (J_{\hat{y}}W_{a_{i}}) \end{aligned}$$
(23)
$$\begin{aligned} W_{b_{i}}(k+1)=W_{b_{i}}(k)+\eta e(k) X Z_b V_{b_i} {\mathrm{sech}}^2(J_{u}W_{b_i}) J_u \end{aligned}$$
(24)
$$\begin{aligned} W_{a_{i}}(k+1)=W_{a_{i}}(k)+\eta e(k) X Z_a V_{a_i} {\mathrm{sech}}^2(J_{\hat{y}}W_{a_i}) J_{\hat{y}} \end{aligned}$$
(25)

1.2 FF2 adaptation algorithms

$$\begin{aligned} X(k+1)=X(k)+\eta e(k) T \end{aligned}$$
(26)
$$\begin{aligned} Z_b(k+1)=Z_b(k)+\eta e(k) X \tanh (r_b) \end{aligned}$$
(27)
$$\begin{aligned} Z_a(k+1)=Z_a(k)+\eta e(k) X \tanh (r_a) \end{aligned}$$
(28)
$$\begin{aligned} Z_h(k+1)=Z_h(k)+ \eta e(k) X \end{aligned}$$
(29)
$$\begin{aligned} V_{b_{i}}(k+1)=V_{b_{i}}(k)+\eta e(k) X Z_b {\mathrm{sech}}^2(r_b) J_{u} W_{b_{i}} \end{aligned}$$
(30)
$$\begin{aligned} V_{a_{i}}(k+1)=V_{a_{i}}(k)+\eta e(k) X Z_a {\mathrm{sech}}^2(r_a) J_{\hat{y}} W_{a_{i}} \end{aligned}$$
(31)
$$\begin{aligned} W_{b_{i}}(k+1)=W_{b_{i}}(k)+\eta e(k) X Z_b {\mathrm{sech}}^2(r_b) V_{b_i} J_u \end{aligned}$$
(32)
$$\begin{aligned} W_{a_{i}}(k+1)=W_{a_{i}}(k)+\eta e(k) X Z_a {\mathrm{sech}}^2(r_a) V_{a_i} J_{\hat{y}} \end{aligned}$$
(33)

1.3 ARX adaptation algorithms

$$\begin{aligned} X(k+1)=X(k)+\eta e(k) T \end{aligned}$$
(34)
$$\begin{aligned} Z_b(k+1)=Z_b(k)+\eta e(k) X r_b \end{aligned}$$
(35)
$$\begin{aligned} Z_a(k+1)=Z_a(k)+\eta e(k) X r_a \end{aligned}$$
(36)
$$\begin{aligned} Z_h(k+1)=Z_h(k)+ \eta e(k) X \end{aligned}$$
(37)
$$\begin{aligned} V_{b_{i}}(k+1)=V_{b_{i}}(k) + \eta e(k) X Z_b J_{u} W_{b_{i}} \end{aligned}$$
(38)
$$\begin{aligned} V_{a_{i}}(k+1)=V_{a_{i}}(k)+\eta e(k) X Z_a J_{\hat{y}} W_{a_{i}} \end{aligned}$$
(39)
$$\begin{aligned} W_{b_{i}}(k+1)=W_{b_{i}}(k)+\eta e(k) X Z_b V_{b_i} J_u \end{aligned}$$
(40)
$$\begin{aligned} W_{a_{i}}(k+1)=W_{a_{i}}(k)+\eta e(k) X Z_a V_{a_i} J_{\hat{y}} \end{aligned}$$
(41)

1.4 NARX adaptation algorithms

$$\begin{aligned} X(k+1)=X(k)+\eta e(k) \tanh (T) \end{aligned}$$
(42)
$$\begin{aligned} Z_b(k+1)=Z_b(k)+\eta e(k) X {\mathrm{sech}}^2(T) r_b \end{aligned}$$
(43)
$$\begin{aligned} Z_a(k+1)=Z_a(k)+\eta e(k) X {\mathrm{sech}}^2(T) r_a \end{aligned}$$
(44)
$$\begin{aligned} Z_h(k+1)=Z_h(k)+ \eta e(k) X {\mathrm{sech}}^2(T) \end{aligned}$$
(45)
$$\begin{aligned} V_{b_{i}}(k+1)=V_{b_{i}}(k) + \eta e(k) X {\mathrm{sech}}^2(T) Z_b J_{u} W_{b_{i}} \end{aligned}$$
(46)
$$\begin{aligned} V_{a_{i}}(k+1)=V_{a_{i}}(k)+\eta e(k) X {\mathrm{sech}}^2(T) Z_a J_{\hat{y}} W_{a_{i}} \end{aligned}$$
(47)
$$\begin{aligned} W_{b_{i}}(k+1)=W_{b_{i}}(k)+\eta e(k) X {\mathrm{sech}}^2(T) Z_b V_{b_i} J_u \end{aligned}$$
(48)
$$\begin{aligned} W_{a_{i}}(k+1)=W_{a_{i}}(k)+\eta e(k) X {\mathrm{sech}}^2(T) Z_a V_{a_i} J_{\hat{y}} \end{aligned}$$
(49)

with \(\eta\), commonly referred to as the learning rate, adapted according to the “search then convergence” algorithm presented in [8].
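To make the adaptation mechanics concrete, the following minimal NumPy sketch applies one FF1 update of Eqs. (18)–(25), together with an illustrative “search then convergence” learning-rate decay; the error convention \(e = y - \hat{y}\), the container layout and the decay constants are assumptions and not the authors' implementation. The FF2, ARX and NARX laws, Eqs. (26)–(49), follow the same pattern with their respective activation functions.

```python
import numpy as np

def ff1_step(y_meas, J_u, J_y, w, eta):
    """One steepest-descent update of the FF1 weights, Eqs. (18)-(25).

    w is a dict with scalars X, Zb, Za, Zh, vectors Vb, Va of shape (nn,),
    and matrices Wb (nn, n_b), Wa (nn, n_a).  J_u and J_y are the input and
    output regressor vectors.  Error convention e = y - y_hat is assumed.
    """
    X, Zb, Za, Zh = w["X"], w["Zb"], w["Za"], w["Zh"]
    Vb, Va, Wb, Wa = w["Vb"], w["Va"], w["Wb"], w["Wa"]

    # forward pass of the FF1 neuro-model
    tb = np.tanh(Wb @ J_u)            # tanh(J_u W_bi) for every neuron i
    ta = np.tanh(Wa @ J_y)            # tanh(J_yhat W_ai)
    r_b, r_a = Vb @ tb, Va @ ta
    T = Zb * r_b + Za * r_a + Zh
    y_hat = X * T
    e = y_meas - y_hat                # output error e(k)

    # Eqs. (18)-(25), using sech^2(x) = 1 - tanh(x)^2
    w["X"]  = X  + eta * e * T
    w["Zb"] = Zb + eta * e * X * r_b
    w["Za"] = Za + eta * e * X * r_a
    w["Zh"] = Zh + eta * e * X
    w["Vb"] = Vb + eta * e * X * Zb * tb
    w["Va"] = Va + eta * e * X * Za * ta
    w["Wb"] = Wb + eta * e * X * Zb * (Vb * (1 - tb ** 2))[:, None] * J_u
    w["Wa"] = Wa + eta * e * X * Za * (Va * (1 - ta ** 2))[:, None] * J_y
    return y_hat, e

def search_then_converge(eta0, k, tau=1000.0):
    """One common 'search then convergence' decay, eta(k) = eta0 / (1 + k/tau);
    eta0 and tau are tuning choices, not values taken from the paper."""
    return eta0 / (1.0 + k / tau)
```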

Appendix 2: Reduction methods applied to models FF2, ARX and NARX

Let us follow the proposed system identification procedure:

1.1 Model FF2

Step 1. Neural network training under particular assumptions. Once the neural network model given by (4) is trained under Assumptions 1 and 2, we obtain:

$$\begin{aligned}&\hat{y}(k)=X^* T \nonumber\\&T=Z_b^* \varphi _2(r_b)+Z_a^* \varphi _2(r_a)+Z_h^*\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}^*(J_{u}W_{b_{i}}^*)}\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}^*(J_{\hat{y}}W_{a_i}^*)} \end{aligned}$$
(50)

Step 2. Model transformation. Since the final values of the synaptic weights satisfy \(W^*_{b_1}=W^*_{b_{j}}\), \(V^*_{b_1}=V^*_{b_{j}}\), \(W^*_{a_1}=W^*_{a_{j}}\) and \(V^*_{a_1}=V^*_{a_{j}}\) with \(j=2,3,\ldots ,nn\), it is possible to make the following algebraic operations in (50):

$$\begin{aligned}&Z_B^*=X^*\times Z_b^*\\&Z_A^*=X^*\times Z_a^*\\&Z_H^*=X^*\times Z_h^*\\&W_{B_i}^*=V_{b_i}^*\times W_{b_i}^*\\&W_{A_i}^*=V_{a_i}^*\times W_{a_i}^*\\ \end{aligned}$$

Then, the 2nn-2-1 neuro-model given by (50) becomes:

$$\begin{aligned}&\hat{y}(k)=T\nonumber\\ &T=Z_B^*\varphi _2(r_b)+Z_A^*\varphi _2(r_a)+Z_H^* \nonumber \\ &r_b=\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}\nonumber \\&r_a=\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}\end{aligned}$$
(51)

where \(W^*_{B_1}=W^*_{B_{j}}\) and \(W^*_{A_1}=W^*_{A_{j}}\) (with \(j=2,3,\ldots ,nn\)) due to Assumption 2.

A supplementary transformation is achieved in order to change the redefined 2nn-1 model (51) into a 2-1 representation by the following algebraic operations:

$$\begin{aligned}&\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}=nn \times (J_{u}W_{B_1}^*)=J_{u}W_{B}^*\\&\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}= nn \times (J_{\hat{y}}W_{A_1}^*)=J_{\hat{y}}W_{A}^* \end{aligned}$$

The resulting model after the model transformation has the following mathematical form:

$$\begin{aligned} \hat{y}(k)=Z_{B}^*\varphi _2(J_{u}W_B^*)+ Z_{A}^*\varphi _2(J_{\hat{y}}W_A^*)+Z_H^* \end{aligned}$$
(52)
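To see that the Step 2 merging is exact, the short NumPy sketch below builds a full 2nn-2-1 FF2 model whose neurons share identical trained weights (Assumption 2), collapses it as above, and checks that the reduced model (52) returns the same output. The random values and the choice \(\varphi_2 = \tanh\) (consistent with the FF2 adaptation laws (27)–(33)) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
nn, n_b, n_a = 40, 7, 8                         # illustrative sizes

# Trained weights under Assumption 2: every neuron ends up with the same values.
W_b = np.tile(rng.normal(size=n_b), (nn, 1))
V_b = np.full(nn, rng.normal())
W_a = np.tile(rng.normal(size=n_a), (nn, 1))
V_a = np.full(nn, rng.normal())
X, Z_b, Z_a, Z_h = rng.normal(size=4)
J_u, J_y = rng.normal(size=n_b), rng.normal(size=n_a)   # regressor vectors

# Full 2nn-2-1 FF2 model, Eq. (50), taking phi_2 = tanh.
r_b = V_b @ (W_b @ J_u)
r_a = V_a @ (W_a @ J_y)
y_full = X * (Z_b * np.tanh(r_b) + Z_a * np.tanh(r_a) + Z_h)

# Step 2 weight merging: Z_B* = X*Z_b*, Z_A* = X*Z_a*, Z_H* = X*Z_h*,
# W_B* = nn * V_b1* * W_b1* and W_A* = nn * V_a1* * W_a1*.
Z_B, Z_A, Z_H = X * Z_b, X * Z_a, X * Z_h
W_B = nn * V_b[0] * W_b[0]
W_A = nn * V_a[0] * W_a[0]

# Reduced 2-1 model, Eq. (52): same output with only n_b + n_a + 3 parameters.
y_red = Z_B * np.tanh(J_u @ W_B) + Z_A * np.tanh(J_y @ W_A) + Z_H
assert np.isclose(y_full, y_red)                # the reduction is exact
```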

1.2 Model ARX

Step 1. Neural network training under particular assumptions. Once the neural network model given by (5) is trained under Assumptions 1 and 2, we obtain:

$$\begin{aligned}&\hat{y}(k)=X^* T\nonumber\\&T=Z_b^* (r_b)+Z_a^* (r_a)+Z_h^*\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}^*(J_{u}W_{b_{i}}^*)}\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}^*(J_{\hat{y}}W_{a_i}^*)} \end{aligned}$$
(53)

Step 2. Model transformation. Since the final values of the synaptic weights satisfy \(W^*_{b_1}=W^*_{b_{j}}\), \(V^*_{b_1}=V^*_{b_{j}}\), \(W^*_{a_1}=W^*_{a_{j}}\) and \(V^*_{a_1}=V^*_{a_{j}}\) with \(j=2,3,\ldots ,nn\), it is possible to make the following algebraic operations in (53):

$$\begin{aligned}&Z_{H}^*=X^*\times Z_h^*\\&W_{B_i}^*=X^*\times Z_b^* \times V_{b_i}^*\times W_{b_i}^*\\&W_{A_i}^*=X^*\times Z_a^* \times V_{a_i}^*\times W_{a_i}^* \end{aligned}$$

Then, the 2nn-2-1 neuro-model given by (53) becomes:

$$\begin{aligned}&\hat{y}(k)=T \nonumber \\ &T=r_b+r_a+Z_H^* \nonumber \\ &r_b=\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}\nonumber \\ &r_a=\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}\end{aligned}$$
(54)

where \(W^*_{B_1}=W^*_{B_{j}}\) and \(W^*_{A_1}=W^*_{A_{j}}\) (with \(j=2,3,\ldots ,nn\)) due to Assumption 2.

A supplementary transformation is achieved in order to change the redefined 2nn-1 model (54) into a 2-1 representation by the following algebraic operations:

$$\begin{aligned}&\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}=nn \times (J_{u}W_{B_1}^*)=J_{u}W_B^*\\&\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}= nn \times (J_{\hat{y}}W_{A_1}^*)=J_{\hat{y}}W_A^* \end{aligned}$$

The resulting model after the model transformation has the following mathematical form:

$$\begin{aligned} \hat{y}(k)=(J_{u}W_B^*)+ (J_{\hat{y}}W_A^*)+Z_H^* \end{aligned}$$
(55)
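Written out component-wise, with \(W_B^*=[b_1\;\cdots\;b_{n_b}]\) and \(W_A^*=[a_1\;\cdots\;a_{n_a}]\) (component notation introduced here for clarity) and assuming the regressor ordering \(J_u=[u(k-1)\;\cdots\;u(k-n_b)]\), \(J_{\hat{y}}=[\hat{y}(k-1)\;\cdots\;\hat{y}(k-n_a)]\), the reduced model (55) is simply a linear ARX-type difference equation with an offset:

$$\hat{y}(k)=\sum _{i=1}^{n_b} b_i\, u(k-i)+\sum _{i=1}^{n_a} a_i\, \hat{y}(k-i)+Z_H^*$$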

1.3 Model NARX

Step 1. Neural network training under particular assumptions. Once the neural network model given by (6) is trained under Assumptions 1 and 2, we obtain:

$$\begin{aligned}&\hat{y}(k)=X^* \varphi _{3}(T)\nonumber\\&T=Z_b^* (r_b)+Z_a^* (r_a)+Z_h^*\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}^*(J_{u}W_{b_{i}}^*)}\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}^*(J_{\hat{y}}W_{a_i}^*)} \end{aligned}$$
(56)

Step 2. Model transformation. The final values of the synaptic weights satisfy \(W^*_{b_1}=W^*_{b_{j}}\), \(V^*_{b_1}=V^*_{b_{j}}\), \(W^*_{a_1}=W^*_{a_{j}}\) and \(V^*_{a_1}=V^*_{a_{j}}\) with \(j=2,3,\ldots ,nn\).

Then, in (56), it is indeed possible to make the following algebraic operations:

$$\begin{aligned}&W_{B_i}^*=Z_b^* \times V_{b_i}^*\times W_{b_i}^*\\&W_{A_i}^*=Z_a^* \times V_{a_i}^*\times W_{a_i}^* \end{aligned}$$

Then, the 2nn-2-1 neuro-model given by (56) becomes:

$$\begin{aligned}&\hat{y}(k)=X^* \varphi _{3}(T) \nonumber\\ &T=r_b+r_a+Z_h^* \nonumber \\ &r_b=\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}\nonumber \\ &r_a=\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}\end{aligned}$$
(57)

A supplementary transformation is achieved in order to change the redefined 2nn-1 model (57) into a 2-1 representation by the following algebraic operations:

$$\begin{aligned}&\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}=nn \times (J_{u}W_{B_1}^*)=J_{u}W_B^*\\&\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}= nn \times (J_{\hat{y}}W_{A_1}^*)=J_{\hat{y}}W_A^* \end{aligned}$$

The resulting model after the model transformation has the following mathematical form:

$$\begin{aligned} \hat{y}(k)=X^* \varphi _{3}((J_{u}W_B^*)+ (J_{\hat{y}}W_A^*)+Z_h^*) \end{aligned}$$
(58)

Appendix 3: Neuro-identification of a flexible robot arm

In order to show the flexibility of the proposed model families, a second system is identified. The data come from a flexible robot arm (see Fig. 9) installed on an electrical motor. The applied persistently exciting input, corresponding to the reaction torque of the structure on the ground, is a periodic sine sweep. The output of the system is the acceleration of the flexible arm. This system identification case comes from an example in DaISy (database for the identification of systems): http://homes.esat.kuleuven.be/~smc/daisy/daisydata.html.

Fig. 9 Flexible robot arm

The system is identified with the complex 2nn-2-1 neuro-model NARX (683 parameters before reduction) given by (59) with \(nn=40,\,n_a=8\) and \(n_b=7\). Notice that this neuro-model is derived from the proposed neural network architecture.

$$\begin{aligned}&\hat{y}(k)=X\tanh (T)\nonumber\\&T=Z_b(r_b)+Z_a(r_a)\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}\left( J_{u}W_{b_{i}}\right) }\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}\left( J_{\hat{y}}W_{a_i}\right) }\end{aligned}$$
(59)
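As a quick arithmetic check of the quoted parameter counts (writing \(np\) for the number of parameters, as in Table 3): each of the \(2nn\) hidden neurons of (59) carries either \(W_{b_i}\in R^{1\times n_b}\) and \(V_{b_i}\), or \(W_{a_i}\in R^{1\times n_a}\) and \(V_{a_i}\), and the output layer adds \(Z_b\), \(Z_a\) and \(X\), so

$$np_{\mathrm{full}} = nn\,(n_b+1) + nn\,(n_a+1) + 3 = 40\times 8 + 40\times 9 + 3 = 683,$$

while the reduced model (60) obtained below keeps only \(W_B^*\in R^{1\times n_b}\), \(W_A^*\in R^{1\times n_a}\) and \(X^*\), that is, \(np_{\mathrm{reduced}} = 7 + 8 + 1 = 16\) parameters.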

Once the model (59) is trained under Assumptions 1 and 2, Theorem 1 (reduction approach) is applied in order to generate a reduced model of the form of (60).

$$\begin{aligned} \hat{y}(k)=X^*\tanh \left( J_{u}W_{B}^*+J_{\hat{y}}W_{A}^*\right) \end{aligned}$$
(60)

where the 16 parameters characterizing the system are as follows:

Proposed NARX model:

$$\begin{aligned}&W^*_B = [0.2883 \quad {-}0.3820 \quad 0.0655 \quad 0.1339 \quad {-}0.0821 \quad {-}0.2135 \quad 0.2335]\\&W^*_A = [{-}0.7387 \quad {-}0.0914 \quad 0.3221 \quad 0.3122 \quad {-}0.0812 \quad {-}0.3016 \quad {-}0.0196 \quad 0.3362]\\&X^*={-}1.4623\\ \end{aligned}$$
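For readers who want to reproduce the free-run simulation of (60) with these values, a minimal Python sketch follows; the regressor ordering \([x(k-1),\ldots,x(k-n)]\), the zero initial conditions and the data-file name are assumptions (the DaISy page documents the actual file format).

```python
import numpy as np

# reported parameters of the reduced NARX model, Eq. (60)
W_B = np.array([0.2883, -0.3820, 0.0655, 0.1339, -0.0821, -0.2135, 0.2335])
W_A = np.array([-0.7387, -0.0914, 0.3221, 0.3122, -0.0812, -0.3016, -0.0196, 0.3362])
X = -1.4623

def simulate_reduced_narx(u, W_B, W_A, X):
    """Free-run simulation of y_hat(k) = X* tanh(J_u W_B* + J_yhat W_A*), Eq. (60)."""
    n_b, n_a = len(W_B), len(W_A)
    y = np.zeros(len(u))
    for k in range(len(u)):
        J_u = np.array([u[k - i] if k >= i else 0.0 for i in range(1, n_b + 1)])
        J_y = np.array([y[k - i] if k >= i else 0.0 for i in range(1, n_a + 1)])
        y[k] = X * np.tanh(J_u @ W_B + J_y @ W_A)
    return y

# hypothetical loading step; the column layout is not specified in this paper:
# u, y_meas = np.loadtxt("robot_arm.dat", unpack=True)
# y_hat = simulate_reduced_narx(u, W_B, W_A, X)
```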

Naturally, we have to compare our approach with another black-box system identification method. For example, the same system is identified with the nonlinear system identification toolbox of MATLAB 2013. Here, the identification model used is the NLARX model, and the nonlinearity estimator is a sigmoid network with 5 units, \(n_a=6\) and \(n_b=10\). The \(619\) parameters of the MATLAB model are estimated by using the Levenberg–Marquardt learning algorithm. The comparison, carried out in terms of the number of parameters \(np\) and the validation results proposed in SYSID 2009 [39], is summarized in Table 3.

In order to compare the performances of both models, the frequency response function (FRF), which is of particular interest to users in practical applications, is computed from the measured data and from the estimated outputs of both the proposed NARX neuro-model and the NLARX model obtained with the MATLAB toolbox (see Fig. 10).
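The FRF comparison of Fig. 10 can be reproduced with standard spectral estimation. The paper does not state which estimator was used; one common choice (assumed here) is the H1 estimate \(\hat{H}(f)=S_{uy}(f)/S_{uu}(f)\) with Welch-averaged spectra:

```python
import numpy as np
from scipy import signal

def frf(u, y, fs, nperseg=1024):
    """H1 frequency-response-function estimate H(f) = S_uy(f) / S_uu(f)."""
    f, S_uu = signal.welch(u, fs=fs, nperseg=nperseg)   # input auto-spectrum
    _, S_uy = signal.csd(u, y, fs=fs, nperseg=nperseg)  # input-output cross-spectrum
    return f, S_uy / S_uu

# Apply with the same input u to the measured output and to each model's
# simulated output, then overlay |H(f)| as in Fig. 10.
```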

Fig. 10 Frequency response function (FRF)

Table 3 A comparative assessment with respect to another parametric method

1.1 Comments

Figure 10 shows the benefit of using neural networks, since the estimated models accurately represent the system behavior over a wide frequency range.

Comparing the two black-box models (see Table 3), the deviation errors (\(s_t\) and \(e_{{\mathrm{RMS}}t}\)) of the MATLAB model are only about two times better, while its number of parameters is substantially larger (\(619\) vs \(16\)). In terms of \(\mu _t\), the two models have almost the same accuracy, despite the use of a simple gradient training algorithm in the computation of the proposed NARX model. Remember that the main objective in this paper is to propose to the user a rather simple and efficient way to find a good balance between accuracy and complexity.
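For reference, the figures of merit quoted in Table 3 (\(\mu _t\), \(s_t\) and \(e_{{\mathrm{RMS}}t}\)) are straightforward to compute from the simulation error; the sign convention \(e_{\mathrm{sim}} = y - \hat{y}\) is an assumption:

```python
import numpy as np

def validation_metrics(y_meas, y_sim):
    """Mean, standard deviation and RMS value of the simulation error."""
    e = np.asarray(y_meas) - np.asarray(y_sim)          # e_sim
    return e.mean(), e.std(), np.sqrt(np.mean(e ** 2))  # mu_t, s_t, e_RMSt
```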

Appendix 4: Neuro-identification of an acoustic duct

In order to show the flexibility of the proposed model families, a third system is identified. The experimental device is an acoustic waveguide made of Plexiglas, used to develop active noise control (see Fig. 11). One end of the duct is almost anechoic and the other end is open. The identification input signal, applied to the control loudspeaker, is a sufficiently exciting pseudo-random binary sequence (PRBS) of length \(L = 2^{10} - 1\) and level \(\pm\)3 V. The sampling period is \(T_S\) = 500 \(\upmu\)s. The aim is to model the first propagative modes of the waveguide (also called the secondary path), which lie in the frequency range \([0; 1{,}000\,\mathrm{Hz}]\). We use several data sets of the same length, namely 1,024 samples, measured by the output microphone. A prior measurement of the propagation delay confirms the analytical value \(\tau \approx 7T_S\).
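For reference, a maximum-length PRBS of length \(2^{10}-1\) at \(\pm\)3 V can be generated as sketched below; the mapping to volts and the absence of oversampling are assumptions, not the authors' exact excitation signal.

```python
import numpy as np
from scipy.signal import max_len_seq

seq, _ = max_len_seq(10)          # 2**10 - 1 = 1023 binary values in {0, 1}
prbs = 3.0 * (2.0 * seq - 1.0)    # map {0, 1} -> {-3 V, +3 V}
Ts = 500e-6                       # sampling period T_S = 500 microseconds
t = np.arange(len(prbs)) * Ts     # time axis for one period of the sequence
```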

Fig. 11 Schematic of the semi-infinite acoustic waveguide

This system is identified with the complex 2nn-2-1 neuro-model FF1 (1,418 parameters before reduction) given by (61) with \(nn=40,\,n_a=15\) and \(n_b=18\).

$$\begin{aligned}&\hat{y}(k)=X T\nonumber\\&T=Z_b r_b+Z_a r_a+Z_h\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}\tanh \left( J_{u}W_{b_{i}}\right) }\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}\tanh \left( J_{\hat{y}}W_{a_i}\right) } \end{aligned}$$
(61)

Once the model is trained under Assumptions 1 and 2, Theorem 1 (reduction approach) is applied in order to generate a reduced model of the form of (62).

$$\begin{aligned} \hat{y}(k)=V_{B}^*\tanh (J_{u}W_{B}^*)+ V_{A}^*\tanh \left( J_{\hat{y}}W_{A}^*\right) \end{aligned}$$
(62)

where the 35 parameters characterizing the system are as follows:

$$\begin{aligned} W^*_A &= \left[ {-}0.0828 \quad 0.0614 \quad{-}0.0718 \quad {-}0.1201 \quad {-}0.0043 \quad {-}0.0067 \quad {-}0.0530\quad {-}0.0474\right. \\&\left. \quad 0.0032 \quad {-}0.0173 \quad{-}0.0381 \quad {-}0.0135 \quad {-}0.0062 \quad {-}0.0151 \quad {-}0.0140\right] \\W^*_B&= \left[ {-}0.0325 \quad {-}0.0555 \quad 0.0256 \quad0.0393 \quad 0.0007 \quad 0.0274 \quad 0.0132 \quad 0.0202 \quad0.0138\right. \\&\left.\quad 0.0006 \quad 0.0059 \quad {-}0.0139 \quad0.0040 \quad {-}0.0188 \quad {-}0.0110 \quad {-}0.0120 \quad {-}0.0002 \quad{-}0.0058\right] \end{aligned}$$

\(V^*_A=6.8643\) and \(V^*_B=3.4414\)
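A free-run simulation of the reduced FF1 model (62) mirrors the NARX sketch of Appendix 3, with the two tanh branches kept separate; as before, the regressor ordering and the zero initial conditions are assumptions, and the reported \(W_B^*\), \(W_A^*\), \(V_B^*\) and \(V_A^*\) values can be plugged in directly.

```python
import numpy as np

def simulate_reduced_ff1(u, W_B, W_A, V_B, V_A):
    """Free-run simulation of Eq. (62):
    y_hat(k) = V_B* tanh(J_u W_B*) + V_A* tanh(J_yhat W_A*)."""
    n_b, n_a = len(W_B), len(W_A)
    y = np.zeros(len(u))
    for k in range(len(u)):
        J_u = np.array([u[k - i] if k >= i else 0.0 for i in range(1, n_b + 1)])
        J_y = np.array([y[k - i] if k >= i else 0.0 for i in range(1, n_a + 1)])
        y[k] = V_B * np.tanh(J_u @ W_B) + V_A * np.tanh(J_y @ W_A)
    return y
```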

Naturally, the proposed approach has to be compared with another black-box system identification technique. Therefore, the same acoustic system is identified with the nonlinear system identification toolbox of MATLAB 2013. Here, the identification model used is the NLARX model, and the nonlinearity estimator is a sigmoid network with 5 units, \(n_a=14\) and \(n_b=12\). The 1,519 parameters of the MATLAB model are estimated by using the Levenberg–Marquardt learning algorithm. The comparison, carried out in terms of the number of parameters \(np\) and the validation results proposed in SYSID 2009 [39], is summarized in Table 4.

In order to compare the performances of both models in the frequency domain, the frequency response function (FRF), which is of particular interest to users in practical applications, is computed from the measured data and from the estimated outputs of both the proposed FF1 neuro-model and the NLARX model obtained with the MATLAB toolbox (see Fig. 12).

Fig. 12 Frequency response function (FRF)

Table 4 A comparative assessment with respect to another parametric method

1.1 Comments

Figure 12 illustrates the good performance of both the proposed FF1 neuro-model and the MATLAB model with respect to the real system behavior. However, when the number of model parameters is taken into consideration, it is remarkable that the proposed model provides this level of accuracy with only 35 parameters.

From Table 4, the proposed model (FF1) and the model obtained with the MATLAB toolbox have almost the same accuracy, but once again the number of parameters of the MATLAB model is substantially larger (1,519 vs 35). We therefore offer the user a system identification approach whose results are comparable to those of the MATLAB toolbox, but with far fewer parameters.


Cite this article

Romero Ugalde, H.M., Carmona, JC., Reyes-Reyes, J. et al. Balanced simplicity–accuracy neural network model families for system identification. Neural Comput & Applic 26, 171–186 (2015). https://doi.org/10.1007/s00521-014-1716-8
