Balanced simplicity–accuracy neural network model families for system identification


Abstract

Nonlinear system identification has provided highly accurate models over the last decades; however, the user remains interested in finding a good balance between high-accuracy models and moderate complexity. In this paper, four balanced accuracy–complexity identification model families are proposed. These models are derived by selecting different combinations of activation functions in a dedicated neural network design presented in our previous work (Romero-Ugalde et al. in Neurocomputing 101:170–180. doi:10.1016/j.neucom.2012.08.013, 2013). The neural network, based on a recurrent three-layer architecture, helps to reduce the number of parameters of the model after the training phase without any loss of estimation accuracy. Even if this reduction is achieved by a convenient choice of the activation functions and of the initial conditions of the synaptic weights, it nevertheless leads to a wide range of models among the most encountered in the literature. To validate the proposed approach, three different systems are identified: the first one corresponds to the well-known Wiener–Hammerstein system proposed as a benchmark in SYSID 2009; the second system is a flexible robot arm; and the third system corresponds to an acoustic duct.


Abbreviations

\(J_{u} \in R^{1\times n_b}\) :

Input regressor vector

\(J_{\hat{y}} \in R^{1\times n_a}\) :

Output regressor vector

\(n_a \in R^{1\times 1}\) :

Number of past outputs of the system

\(n_b \in R^{1\times 1}\) :

Number of past inputs of the system

\(X \in R^{1\times 1}\) :

Synaptic weight

\(Z_{b} \in R^{1\times 1}\) :

Synaptic weight

\(Z_{a} \in R^{1\times 1}\) :

Synaptic weight

\(V_{b_i} \in R^{1\times 1}\) :

Synaptic weight

\(V_{a_i} \in R^{1\times 1}\) :

Synaptic weight

\(Z_h \in R^{1\times 1}\) :

Synaptic weight

\(W_{b_{i}} \in R^{1\times n_b}\) :

Synaptic weight

\(W_{a_i} \in R^{1\times n_a}\) :

Synaptic weight

\(W_{B} \in R^{1\times n_b}\) :

Synaptic weight

\(W_{A} \in R^{1\times n_a}\) :

Synaptic weight

\(V_{B} \in R^{1\times 1}\) :

Synaptic weight

\(V_{A} \in R^{1\times 1}\) :

Synaptic weight

\(Z_H \in R^{1\times 1}\) :

Synaptic weight

\(X^* \in R^{1\times 1}\) :

Synaptic weight after training

\(Z_{b}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(Z_{a}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(V_{b_i}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(V_{a_i}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(Z_h^* \in R^{1\times 1}\) :

Synaptic weight after training

\(W_{b_{i}}^* \in R^{1\times n_b}\) :

Synaptic weight after training

\(W_{a_i}^* \in R^{1\times n_a}\) :

Synaptic weight after training

\(W_{B}^* \in R^{1\times n_b}\) :

Synaptic weight after training

\(W_{A}^* \in R^{1\times n_a}\) :

Synaptic weight after training

\(V_{B}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(V_{A}^* \in R^{1\times 1}\) :

Synaptic weight after training

\(Z_H^* \in R^{1\times 1}\) :

Synaptic weight after training

\(\varphi _{1}\) :

Activation function (linear or nonlinear)

\(\varphi _{2}\) :

Activation function (linear or nonlinear)

\(\varphi _{3}\) :

Activation function (linear or nonlinear)

\(nn \in R^{1\times 1}\) :

Number of neurons

\(e_{{\mathrm{sim}}}\) :

Simulation error

\(\mu _t\) :

Mean value of the simulation error

\(s_t\) :

Standard deviation of the error

\(e_{{\mathrm{RMS}}t}\) :

Root mean square (RMS) of the error

\(u\) :

Input of the neural network

\(\hat{y}\) :

Output of the neural network

References

  1. Aadaleesan P, Miglan N, Sharma R, Saha P (2008) Nonlinear system identification using Wiener type Laguerre–Wavelet network model. Chem Eng Sci 63(15):3932–3941. doi:10.1016/j.ces.2008.04.043

  2. An SQ, Lu T, Ma Y (2010) Simple adaptive control for siso nonlinear systems using neural network based on genetic algorithm. In: Proceedings of the ninth international conference on machine learning and cybernetics IEEE, Qingdao, China

  3. Angelov P (2011) Fuzzily connected multimodel systems evolving autonomously from data streams. IEEE Trans Syst Man Cybern Part B Cybern 41(4):898–910. doi:10.1109/TSMCB.2010.2098866

  4. Bebis G, Georgiopoulos M (1994) Feed-forward neural networks: why network size is so important. IEEE Potentials 13(4):27–31

  5. Biao L, Qing-chun L, Zhen-hua J, Sheng-fang N (2009) System identification of locomotive diesel engines with autoregressive neural network. In: ICIEA, IEEE, Xi’an, China. doi:10.1109/ICIEA.2009.5138836

  6. Castañeda C, Loukianov A, Sanchez E, Castillo-Toledo B (2013) Real-time torque control using discrete-time recurrent high-order neural networks. Neural Comput Appl 22:1223–1232. doi:10.1007/s00521-012-0890-9

  7. Chen R (2011) Reducing network and computation complexities in neural based real-time scheduling scheme. Appl Math Comput 217(13):6379–6389. doi:10.1016/j.amc.2011.01.014

  8. Cichocki A, Unbehauen R (1993) Neural networks for optimization and signal processing, 1st edn. John Wiley and Sons Ltd, Baffins Lane, Chichester, West Sussex PO19 IUD, England

  9. Coelho L, Wicthoff M (2009) Nonlinear identification using a b-spline neural network and chaotic immune approaches. Mech Syst Signal Process 23(8):2418–2434. doi:10.1016/j.ymssp.2009.01.013

  10. Curteanu S, Cartwright H (2011) Neural networks applied in chemistry. I. Determination of the optimal topology of multilayer perceptron neural networks. J Chemom 25:527–549. doi:10.1002/cem.1401

  11. de Jesus Rubio J (2014) Fuzzy slopes model of nonlinear systems with sparse data. Soft Comput. doi:10.1007/s00500-014-1289-6

  12. de Jesus Rubio J (2014) Evolving intelligent algorithms for the modelling of brain and eye signals. Appl Soft Comput 14(part B):259–268. doi:10.1016/j.asoc.2013.07.023

  13. Endisch C, Stolze P, Endisch P, Hackl C, Kennel R (2009) Levenberg–Marquardt-based obs algorithm using adaptive pruning interval for system identification with dynamic neural networks. In: International conference on systems, man, and cybernetics, IEEE, San Antonio, Texas, USA

  14. Farivar F, Shoorehdeli MA, Teshnehlab M (2012) An interdisciplinary overview and intelligent control of human prosthetic eye movements system for the emotional support by a huggable pet-type robot from a biomechatronical viewpoint. J Frankl Inst 347(7):2243–2267. doi:10.1016/j.jfranklin.2011.04.014

  15. Ge H, Du W, Qian F, Liang Y (2009) Identification and control of nonlinear systems by a time-delay recurrent neural network. Neurocomputing 72:2857–2864. doi:10.1016/j.neucom.2008.06.030

  16. Ge H, Qian F, Liang Y, Du W, Wang L (2008) Identification and control of nonlinear systems by a dissimilation particle swarm optimization-based elman neural network. Nonlinear Anal Real World Appl 9(4):1345–1360. doi:10.1016/j.nonrwa.2007.03.008

  17. Goh CK, Teoh EJ, Tan KC (2008) Hybrid multiobjective evolutionary design for artificial neural networks. IEEE Trans Neural Netw 19(9):1531–1548

  18. Han X, Xie W, Fu Z, Luo W (2011) Nonlinear systems identification using dynamic multi-time scale neural networks. Neurocomputing 74(17):3428–3439

  19. Hangos K, Bokor J, Szederknyi G (2004) Analysis and control of nonlinear process systems. Springer, Berlin

  20. Hsu CF (2009) Adaptive recurrent neural network control using a structure adaptation algorithm. Neural Comput Appl 18:115–125. doi:10.1007/s00521-007-0164-0

  21. Isermann R, Munchhof M (2011) Identification of dynamic systems. An introduction with applications. Springer, Berlin

  22. de Jesus Rubio J, Pérez Cruz JH (2014) Evolving intelligent system for the modelling of nonlinear systems with dead-zone input. Appl Soft Comput 14(Part B):289–304. doi:10.1016/j.asoc.2013.03.018

  23. Khalaj G, Yoozbashizadeh H, Khodabandeh A, Nazari A (2013) Artificial neural network to predict the effect of heat treatments on vickers microhardness of low-carbon nb microalloyed steels. Neural Comput Appl 22(5):879–888. doi:10.1007/s00521-011-0779-z

  24. Leite D, Costa P, Gomide F (2013) Evolving granular neural networks from fuzzy data streams. Neural Netw 38:1–16. doi:10.1016/j.neunet.2012.10.006

  25. Lemos A, Caminhas W, Gomide F (2011) Multivariable gaussian evolving fuzzy modeling system. IEEE Trans Fuzzy Syst 19(1):91–104. doi:10.1109/TFUZZ.2010.2087381

  26. Ljung L (1999) System identification theory for the user. PTR Prentice Hall, Upper Saddle River, NJ 07458

  27. Loghmanian S, Jamaluddin H, Ahmad R, Yusof R, Khalid M (2012) Structure optimization of neural network for dynamic system modeling using multi-objective genetic algorithm. Neural Comput Appl 21(6):1281–1295. doi:10.1007/s00521-011-0560-3

  28. Lughofer E (2013) On-line assurance of interpretability criteria in evolving fuzzy systems. Achievements, new concepts and open issues. Inf Sci 251:22–46. doi:10.1016/j.ins.2013.07.002

  29. Majhi B, Panda G (2011) Robust identification of nonlinear complex systems using low complexity ANN and particle swarm optimization technique. Expert Syst Appl 38(1):321–333. doi:10.1016/j.eswa.2010.06.070

  30. Noorgard M, Ravn O, Poulsen NK, Hansen LK (2000) Neural networks for modelling and control of dynamic systems, 1st edn. Springer, Berlin

  31. Ordon̂ez FJ, Iglesias JA, de Toledo P, Ledezma A, Sanchis A (2013) Online activity recognition using evolving classifiers. Expert Syst Appl 40:1248–1255. doi:10.1016/j.eswa.2012.08.066

  32. Peralta-Donate J, Li X, Gutierrez-Sanchez G, Sanchis de Miguel A (2013) Time series forecasting by evolving artificial neural networks with genetic algorithms, differential evolution and estimation of distribution algorithm. Neural Comput Appl 22:11–20. doi:10.1007/s00521-011-0741-0

  33. Paduart J, Lauwers L, Pintelon R, Schoukens J (2009) Identification of a wiener-hammerstein system using the polynomial nonlinear state space approach. In: Proceedings of the 15th IFAC symposium on system identification, Saint-Malo, France, pp 1080–1085

  34. Petre E, Selisteanu D, Sendrescu D, Ionete C (2010) Neural networks-based adaptive control for a class of nonlinear bioprocesses. Neural Comput Appl 19:169–178. doi:10.1007/s00521-009-0284-9

  35. Pratama M, Anavatti SG, Angelov PP, Lughofer E (2014) PANFIS: a novel incremental learning machine. IEEE Trans Neural Netw Learn Syst 25(1):55–68. doi:10.1109/TNNLS.2013.2271933

  36. Romero-Ugalde HM, Carmona JC, Alvarado VM, Reyes-Reyes J (2013) Neural network design and model reduction approach for black box nonlinear system identification with reduced number of parameters. Neurocomputing 101:170–180. doi:10.1016/j.neucom.2012.08.013

  37. Sahnoun MA, Ugalde HMR, Carmona JC, Gomand J (2013) Maximum power point tracking using p&o control optimized by a neural network approach: a good compromise between accuracy and complexity. Energy Procedia 42:650–659. doi:10.1016/j.egypro.2013.11.067

  38. Sayah S, Hamouda A (2013) A hybrid differential evolution algorithm based on particle swarm optimization for nonconvex economic dispatch problems. Appl Soft Comput 13:1608–1619. doi:10.1016/j.asoc.2012.12.014

  39. Schoukens J, Suykens J, Ljung L (2009) Wiener-Hammerstein benchmark. In: Proceedings of the 15th IFAC symposium on system identification, Saint-Malo, France, pp 1086–1091

  40. Subudhi B, Jenab D (2011) A differential evolution based neural network approach to nonlinear system identification. Appl Soft Comput 11(1):861–871. doi:10.1016/j.asoc.2010.01.006

  41. Tzeng S (2010) Design of fuzzy wavelet neural networks using the GA approach for function approximation and system identification. Fuzzy Sets Syst 161(19):2585–2596. doi:10.1016/j.fss.2010.06.002

  42. Van Mulders A, Schoukens J, Volckaert M, Diehl M (2009) Two nonlinear optimization methods for black box identification compared. In: Proceedings of the 15th IFAC symposium on system identification, Saint-Malo, France, pp 1086–1091

  43. Wang X, Syrmos V (2007) Nonlinear system identification and fault detection using hierarchical clustering analysis and local linear models. In: Mediterranean conference on control and automation, Greece, Athens, pp 1–6

  44. Witters M, Swevers J (2010) Black-box model identification for a continuously variable, electro-hydraulic semi-active damper. Mech Syst Signal Process 24(1):4–18. doi:10.1016/j.ymssp.2009.03.013

  45. Xie W, Zhu Y, Zhao Z, Wong Y (2009) Nonlinear system identification using optimized dynamic neural network. Neurocomputing 72(13–15):3277–3287. doi:10.1016/j.neucom.2009.02.004

  46. Yan Z, Xiuxia L, Peng Y, Zengqiang C, Zhuzhi Y (2009) Modeling and control of nonlinear discrete-time systems based on compound neural networks. Chin J Chem Eng 17(3):454–459. doi:10.1016/S1004-9541(08

  47. Yu W (2006) Multiple recurrent neural networks for stable adaptive control. Neurocomputing 70(1–3):430–444. doi:10.1016/j.neucom.2005.12.122

  48. Yu W, Li X (2004) Fuzzy identification using fuzzy neural networks with stable learning algorithms. IEEE Trans Fuzzy Syst 12(3):411–420. doi:10.1109/TFUZZ.2004.825067

  49. Yu W, Morales A (2004) Gasoline blending system modeling via static and dynamic neural networks. Int J Model Simul 24(3):151–160

  50. Yu W, Rodriguez FO, Moreno-Armendariz MA (2008) Hierarchical fuzzy CMAC for nonlinear systems modeling. IEEE Trans Fuzzy Syst 16(5):1302–1314. doi:10.1109/TFUZZ.2008.926579

  51. Zhang H, Wu W, Yao M (2012) Boundedness and convergence of batch back-propagation algorithm with penalty for feedforward neural networks. Neurocomputing 89:141–146. doi:10.1016/j.neucom.2012.02.029

  52. Zhang J, Zhu Q, Wu X, Li Y (2013) A generalized indirect adaptive neural networks backstepping control procedure for a class of non-affine nonlinear systems with pure-feedback prototype. Neurocomputing 21(9):131–139. doi:10.1016/j.neucom.2013.04.015

  53. Zhang Z, Qiao J (2010) A node pruning algorithm for feedforward neural network based on neural complexity. In: International conference on intelligent control and information processing. IEEE, Dalian, China, pp 406–410

  54. Zhao H, Zeng X, He Z (2011) Low-complexity nonlinear adaptive filter based on a pipelined bilinear recurrent neural network. IEEE Trans Neural Netw 22(9):1494–1507

Author information

Correspondence to Hector M. Romero Ugalde.

Appendices

Appendix 1: Training

The adaptation laws of the synaptic weights, derived according to the steepest-descent gradient method, are as follows:

1.1 FF1 adaptation algorithms

$$\begin{aligned} X(k+1)=X(k)+\eta e(k) T \end{aligned}$$
(18)
$$\begin{aligned} Z_b(k+1)=Z_b(k)+\eta e(k) X r_b \end{aligned}$$
(19)
$$\begin{aligned} Z_a(k+1)=Z_a(k)+ \eta e(k) X r_a \end{aligned}$$
(20)
$$\begin{aligned} Z_h(k+1)=Z_h(k)+ \eta e(k) X \end{aligned}$$
(21)
$$\begin{aligned} V_{b_{i}}(k+1)=V_{b_{i}}(k)+\eta e(k) X Z_b \tanh (J_{u}W_{b_{i}}) \end{aligned}$$
(22)
$$\begin{aligned} V_{a_{i}}(k+1)=V_{a_{i}}(k)+\eta e(k) X Z_a \tanh (J_{\hat{y}}W_{a_{i}}) \end{aligned}$$
(23)
$$\begin{aligned} W_{b_{i}}(k+1)=W_{b_{i}}(k)+\eta e(k) X Z_b V_{b_i} {\mathrm{sech}}^2(J_{u}W_{b_i}) J_u \end{aligned}$$
(24)
$$\begin{aligned} W_{a_{i}}(k+1)=W_{a_{i}}(k)+\eta e(k) X Z_a V_{a_i} {\mathrm{sech}}^2(J_{\hat{y}}W_{a_i}) J_{\hat{y}} \end{aligned}$$
(25)

1.2 FF2 adaptation algorithms

$$\begin{aligned} X(k+1)=X(k)+\eta e(k) T \end{aligned}$$
(26)
$$\begin{aligned} Z_b(k+1)=Z_b(k)+\eta e(k) X \tanh (r_b) \end{aligned}$$
(27)
$$\begin{aligned} Z_a(k+1)=Z_a(k)+\eta e(k) X \tanh (r_a) \end{aligned}$$
(28)
$$\begin{aligned} Z_h(k+1)=Z_h(k)+ \eta e(k) X \end{aligned}$$
(29)
$$\begin{aligned} V_{b_{i}}(k+1)=V_{b_{i}}(k)+\eta e(k) X Z_b {\mathrm{sech}}^2(r_b) J_{u} W_{b_{i}} \end{aligned}$$
(30)
$$\begin{aligned} V_{a_{i}}(k+1)=V_{a_{i}}(k)+\eta e(k) X Z_a {\mathrm{sech}}^2(r_a) J_{\hat{y}} W_{a_{i}} \end{aligned}$$
(31)
$$\begin{aligned} W_{b_{i}}(k+1)=W_{b_{i}}(k)+\eta e(k) X Z_b {\mathrm{sech}}^2(r_b) V_{b_i} J_u \end{aligned}$$
(32)
$$\begin{aligned} W_{a_{i}}(k+1)=W_{a_{i}}(k)+\eta e(k) X Z_a {\mathrm{sech}}^2(r_a) V_{a_i} J_{\hat{y}} \end{aligned}$$
(33)

1.3 ARX adaptation algorithms

$$\begin{aligned} X(k+1)=X(k)+\eta e(k) T \end{aligned}$$
(34)
$$\begin{aligned} Z_b(k+1)=Z_b(k)+\eta e(k) X r_b \end{aligned}$$
(35)
$$\begin{aligned} Z_a(k+1)=Z_a(k)+\eta e(k) X r_a \end{aligned}$$
(36)
$$\begin{aligned} Z_h(k+1)=Z_h(k)+ \eta e(k) X \end{aligned}$$
(37)
$$\begin{aligned} V_{b_{i}}(k+1)=V_{b_{i}}(k) + \eta e(k) X Z_b J_{u} W_{b_{i}} \end{aligned}$$
(38)
$$\begin{aligned} V_{a_{i}}(k+1)=V_{a_{i}}(k)+\eta e(k) X Z_a J_{\hat{y}} W_{a_{i}} \end{aligned}$$
(39)
$$\begin{aligned} W_{b_{i}}(k+1)=W_{b_{i}}(k)+\eta e(k) X Z_b V_{b_i} J_u \end{aligned}$$
(40)
$$\begin{aligned} W_{a_{i}}(k+1)=W_{a_{i}}(k)+\eta e(k) X Z_a V_{a_i} J_{\hat{y}} \end{aligned}$$
(41)

1.4 NARX adaptation algorithms

$$\begin{aligned} X(k+1)=X(k)+\eta e(k) \tanh (T) \end{aligned}$$
(42)
$$\begin{aligned} Z_b(k+1)=Z_b(k)+\eta e(k) X {\mathrm{sech}}^2(T) r_b \end{aligned}$$
(43)
$$\begin{aligned} Z_a(k+1)=Z_a(k)+\eta e(k) X {\mathrm{sech}}^2(T) r_a \end{aligned}$$
(44)
$$\begin{aligned} Z_h(k+1)=Z_h(k)+ \eta e(k) X {\mathrm{sech}}^2(T) \end{aligned}$$
(45)
$$\begin{aligned} V_{b_{i}}(k+1)=V_{b_{i}}(k) + \eta e(k) X {\mathrm{sech}}^2(T) Z_b J_{u} W_{b_{i}} \end{aligned}$$
(46)
$$\begin{aligned} V_{a_{i}}(k+1)=V_{a_{i}}(k)+\eta e(k) X {\mathrm{sech}}^2(T) Z_a J_{\hat{y}} W_{a_{i}} \end{aligned}$$
(47)
$$\begin{aligned} W_{b_{i}}(k+1)=W_{b_{i}}(k)+\eta e(k) X {\mathrm{sech}}^2(T) Z_b V_{b_i} J_u \end{aligned}$$
(48)
$$\begin{aligned} W_{a_{i}}(k+1)=W_{a_{i}}(k)+\eta e(k) X {\mathrm{sech}}^2(T) Z_a V_{a_i} J_{\hat{y}} \end{aligned}$$
(49)

with \(\eta\), commonly referred to as the learning rate, adapted according to the “search then convergence” algorithm presented in [8].
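To make the adaptation mechanics concrete, the following minimal NumPy sketch applies one FF1 update of Eqs. (18)–(25), together with an illustrative “search then convergence” learning-rate decay; the error convention \(e = y - \hat{y}\), the container layout and the decay constants are assumptions and not the authors' implementation. The FF2, ARX and NARX laws, Eqs. (26)–(49), follow the same pattern with their respective activation functions.

```python
import numpy as np

def ff1_step(y_meas, J_u, J_y, w, eta):
    """One steepest-descent update of the FF1 weights, Eqs. (18)-(25).

    w is a dict with scalars X, Zb, Za, Zh, vectors Vb, Va of shape (nn,),
    and matrices Wb (nn, n_b), Wa (nn, n_a).  J_u and J_y are the input and
    output regressor vectors.  Error convention e = y - y_hat is assumed.
    """
    X, Zb, Za, Zh = w["X"], w["Zb"], w["Za"], w["Zh"]
    Vb, Va, Wb, Wa = w["Vb"], w["Va"], w["Wb"], w["Wa"]

    # forward pass of the FF1 neuro-model
    tb = np.tanh(Wb @ J_u)            # tanh(J_u W_bi) for every neuron i
    ta = np.tanh(Wa @ J_y)            # tanh(J_yhat W_ai)
    r_b, r_a = Vb @ tb, Va @ ta
    T = Zb * r_b + Za * r_a + Zh
    y_hat = X * T
    e = y_meas - y_hat                # output error e(k)

    # Eqs. (18)-(25), using sech^2(x) = 1 - tanh(x)^2
    w["X"]  = X  + eta * e * T
    w["Zb"] = Zb + eta * e * X * r_b
    w["Za"] = Za + eta * e * X * r_a
    w["Zh"] = Zh + eta * e * X
    w["Vb"] = Vb + eta * e * X * Zb * tb
    w["Va"] = Va + eta * e * X * Za * ta
    w["Wb"] = Wb + eta * e * X * Zb * (Vb * (1 - tb ** 2))[:, None] * J_u
    w["Wa"] = Wa + eta * e * X * Za * (Va * (1 - ta ** 2))[:, None] * J_y
    return y_hat, e

def search_then_converge(eta0, k, tau=1000.0):
    """One common 'search then convergence' decay, eta(k) = eta0 / (1 + k/tau);
    eta0 and tau are tuning choices, not values taken from the paper."""
    return eta0 / (1.0 + k / tau)
```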

Appendix 2: Reduction methods applied to models FF2, ARX and NARX

Let us follow the proposed system identification procedure:

1.1 Model FF2

Step 1. Neural network training under particular assumptions. Once the neural network model given by (4) is trained under Assumptions 1 and 2, we obtain:

$$\begin{aligned}&\hat{y}(k)=X^* T \nonumber\\&T=Z_b^* \varphi _2(r_b)+Z_a^* \varphi _2(r_a)+Z_h^*\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}^*(J_{u}W_{b_{i}}^*)}\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}^*(J_{\hat{y}}W_{a_i}^*)} \end{aligned}$$
(50)

Step 2. Model transformation. Since the final values of the synaptic weights satisfy \(W^*_{b_1}=W^*_{b_{j}}\), \(V^*_{b_1}=V^*_{b_{j}}\), \(W^*_{a_1}=W^*_{a_{j}}\) and \(V^*_{a_1}=V^*_{a_{j}}\) with \(j=2,3,\ldots ,nn\), it is possible to make the following algebraic operations in (50):

$$\begin{aligned}&Z_B^*=X^*\times Z_b^*\\&Z_A^*=X^*\times Z_a^*\\&Z_H^*=X^*\times Z_h^*\\&W_{B_i}^*=V_{b_i}^*\times W_{b_i}^*\\&W_{A_i}^*=V_{a_i}^*\times W_{a_i}^*\\ \end{aligned}$$

Then, the 2nn-2-1 neuro-model given by (50) becomes:

$$\begin{aligned}&\hat{y}(k)=T\nonumber\\ &T=Z_B^*\varphi _2(r_b)+Z_A^*\varphi _2(r_a)+Z_H^* \nonumber \\ &r_b=\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}\nonumber \\&r_a=\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}\end{aligned}$$
(51)

where \(W^*_{B_1}=W^*_{B_{j}}\) and \(W^*_{A_1}=W^*_{A_{j}}\) (with \(j=2,3,\ldots ,nn\)) due to Assumption 2.

A supplementary transformation is achieved in order to change the redefined 2nn-1 model (51) into a 2-1 representation by the following algebraic operations:

$$\begin{aligned}&\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}=nn \times (J_{u}W_{B_1}^*)=J_{u}W_{B}^*\\&\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}= nn \times (J_{\hat{y}}W_{A_1}^*)=J_{\hat{y}}W_{A}^* \end{aligned}$$

The resulting model after the model transformation has the following mathematical form:

$$\begin{aligned} \hat{y}(k)=Z_{B}^*\varphi _2(J_{u}W_B^*)+ Z_{A}^*\varphi _2(J_{\hat{y}}W_A^*)+Z_H^* \end{aligned}$$
(52)
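To see that the Step 2 merging is exact, the short NumPy sketch below builds a full 2nn-2-1 FF2 model whose neurons share identical trained weights (Assumption 2), collapses it as above, and checks that the reduced model (52) returns the same output. The random values and the choice \(\varphi_2 = \tanh\) (consistent with the FF2 adaptation laws (27)–(33)) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
nn, n_b, n_a = 40, 7, 8                         # illustrative sizes

# Trained weights under Assumption 2: every neuron ends up with the same values.
W_b = np.tile(rng.normal(size=n_b), (nn, 1))
V_b = np.full(nn, rng.normal())
W_a = np.tile(rng.normal(size=n_a), (nn, 1))
V_a = np.full(nn, rng.normal())
X, Z_b, Z_a, Z_h = rng.normal(size=4)
J_u, J_y = rng.normal(size=n_b), rng.normal(size=n_a)   # regressor vectors

# Full 2nn-2-1 FF2 model, Eq. (50), taking phi_2 = tanh.
r_b = V_b @ (W_b @ J_u)
r_a = V_a @ (W_a @ J_y)
y_full = X * (Z_b * np.tanh(r_b) + Z_a * np.tanh(r_a) + Z_h)

# Step 2 weight merging: Z_B* = X*Z_b*, Z_A* = X*Z_a*, Z_H* = X*Z_h*,
# W_B* = nn * V_b1* * W_b1* and W_A* = nn * V_a1* * W_a1*.
Z_B, Z_A, Z_H = X * Z_b, X * Z_a, X * Z_h
W_B = nn * V_b[0] * W_b[0]
W_A = nn * V_a[0] * W_a[0]

# Reduced 2-1 model, Eq. (52): same output with only n_b + n_a + 3 parameters.
y_red = Z_B * np.tanh(J_u @ W_B) + Z_A * np.tanh(J_y @ W_A) + Z_H
assert np.isclose(y_full, y_red)                # the reduction is exact
```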

1.2 Model ARX

Step 1. Neural network training under particular assumptions. Once the neural network model given by (5) is trained under Assumptions 1 and 2, we obtain:

$$\begin{aligned}&\hat{y}(k)=X^* T\nonumber\\&T=Z_b^* (r_b)+Z_a^* (r_a)+Z_h^*\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}^*(J_{u}W_{b_{i}}^*)}\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}^*(J_{\hat{y}}W_{a_i}^*)} \end{aligned}$$
(53)

Step 2. Model transformation. Since the final values of the synaptic weights satisfy \(W^*_{b_1}=W^*_{b_{j}}\), \(V^*_{b_1}=V^*_{b_{j}}\), \(W^*_{a_1}=W^*_{a_{j}}\) and \(V^*_{a_1}=V^*_{a_{j}}\) with \(j=2,3,\ldots ,nn\), it is possible to make the following algebraic operations in (53):

$$\begin{aligned}&Z_{H}^*=X^*\times Z_h^*\\&W_{B_i}^*=X^*\times Z_b^* \times V_{b_i}^*\times W_{b_i}^*\\&W_{A_i}^*=X^*\times Z_a^* \times V_{a_i}^*\times W_{a_i}^* \end{aligned}$$

Then, the 2nn-2-1 neuro-model given by (53) becomes:

$$\begin{aligned}&\hat{y}(k)=T \nonumber \\ &T=r_b+r_a+Z_H^* \nonumber \\ &r_b=\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}\nonumber \\ &r_a=\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}\end{aligned}$$
(54)

where \(W^*_{B_1}=W^*_{B_{j}}\) and \(W^*_{A_1}=W^*_{A_{j}}\) (with \(j=2,3,\ldots ,nn\)) due to Assumption 2.

A supplementary transformation is achieved in order to change the redefined 2nn-1 model (54) into a 2-1 representation by the following algebraic operations:

$$\begin{aligned}&\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}=nn \times (J_{u}W_{B_1}^*)=J_{u}W_B^*\\&\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}= nn \times (J_{\hat{y}}W_{A_1}^*)=J_{\hat{y}}W_A^* \end{aligned}$$

The resulting model after the model transformation has the following mathematical form:

$$\begin{aligned} \hat{y}(k)=(J_{u}W_B^*)+ (J_{\hat{y}}W_A^*)+Z_H^* \end{aligned}$$
(55)
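Written out component-wise, with \(W_B^*=[b_1\;\cdots\;b_{n_b}]\) and \(W_A^*=[a_1\;\cdots\;a_{n_a}]\) (component notation introduced here for clarity) and assuming the regressor ordering \(J_u=[u(k-1)\;\cdots\;u(k-n_b)]\), \(J_{\hat{y}}=[\hat{y}(k-1)\;\cdots\;\hat{y}(k-n_a)]\), the reduced model (55) is simply a linear ARX-type difference equation with an offset:

$$\hat{y}(k)=\sum _{i=1}^{n_b} b_i\, u(k-i)+\sum _{i=1}^{n_a} a_i\, \hat{y}(k-i)+Z_H^*$$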

1.3 Model NARX

Step 1. Neural network training under particular assumptions. Once the neural network model given by (6) is trained under Assumptions 1 and 2, we obtain:

$$\begin{aligned}&\hat{y}(k)=X^* \varphi _{3}(T)\nonumber\\&T=Z_b^* (r_b)+Z_a^* (r_a)+Z_h^*\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}^*(J_{u}W_{b_{i}}^*)}\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}^*(J_{\hat{y}}W_{a_i}^*)} \end{aligned}$$
(56)

Step 2. Model transformation. The final values of the synaptic weights satisfy \(W^*_{b_1}=W^*_{b_{j}}\), \(V^*_{b_1}=V^*_{b_{j}}\), \(W^*_{a_1}=W^*_{a_{j}}\) and \(V^*_{a_1}=V^*_{a_{j}}\) with \(j=2,3,\ldots ,nn\).

Then, in (56), it is indeed possible to make the following algebraic operations:

$$\begin{aligned}&W_{B_i}^*=Z_b^* \times V_{b_i}^*\times W_{b_i}^*\\&W_{A_i}^*=Z_a^* \times V_{a_i}^*\times W_{a_i}^* \end{aligned}$$

Then, the 2nn-2-1 neuro-model given by (56) becomes:

$$\begin{aligned}&\hat{y}(k)=X^* \varphi _{3}(T) \nonumber\\ &T=r_b+r_a+Z_h^* \nonumber \\ &r_b=\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}\nonumber \\ &r_a=\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}\end{aligned}$$
(57)

A supplementary transformation is achieved in order to change the redefined 2nn-1 model (57) into a 2-1 representation by the following algebraic operations:

$$\begin{aligned}&\sum _{i=1}^{nn}{(J_{u}W_{B_{i}}^*)}=nn \times (J_{u}W_{B_1}^*)=J_{u}W_B^*\\&\sum _{i=1}^{nn}{(J_{\hat{y}}W_{A_i}^*)}= nn \times (J_{\hat{y}}W_{A_1}^*)=J_{\hat{y}}W_A^* \end{aligned}$$

The resulting model after the model transformation has the following mathematical form:

$$\begin{aligned} \hat{y}(k)=X^* \varphi _{3}((J_{u}W_B^*)+ (J_{\hat{y}}W_A^*)+Z_h^*) \end{aligned}$$
(58)

Appendix 3: Neuro-identification of a flexible robot arm

In order to show the flexibility of the proposed model families, a second system is identified. The data come from a flexible robot arm (see Fig. 9) installed on an electrical motor. The applied persistently exciting input, corresponding to the reaction torque of the structure on the ground, is a periodic sine sweep. The output of the system is the acceleration of the flexible arm. This system identification case comes from an example in DaISy (database for the identification of systems): http://homes.esat.kuleuven.be/~smc/daisy/daisydata.html.

Fig. 9 Flexible robot arm

The system is identified with the complex 2nn-2-1 neuro-model NARX (683 parameters before reduction) given by (59) with \(nn=40,\,n_a=8\) and \(n_b=7\). Notice that this neuro-model is derived from the proposed neural network architecture.

$$\begin{aligned}&\hat{y}(k)=X\tanh (T)\nonumber\\&T=Z_b(r_b)+Z_a(r_a)\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}\left( J_{u}W_{b_{i}}\right) }\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}\left( J_{\hat{y}}W_{a_i}\right) }\end{aligned}$$
(59)
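As a quick arithmetic check of the quoted parameter counts (writing \(np\) for the number of parameters, as in Table 3): each of the \(2nn\) hidden neurons of (59) carries either \(W_{b_i}\in R^{1\times n_b}\) and \(V_{b_i}\), or \(W_{a_i}\in R^{1\times n_a}\) and \(V_{a_i}\), and the output layer adds \(Z_b\), \(Z_a\) and \(X\), so

$$np_{\mathrm{full}} = nn\,(n_b+1) + nn\,(n_a+1) + 3 = 40\times 8 + 40\times 9 + 3 = 683,$$

while the reduced model (60) obtained below keeps only \(W_B^*\in R^{1\times n_b}\), \(W_A^*\in R^{1\times n_a}\) and \(X^*\), that is, \(np_{\mathrm{reduced}} = 7 + 8 + 1 = 16\) parameters.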

Once the model (59) is trained under Assumptions 1 and 2, Theorem 1 (reduction approach) is applied in order to generate a reduced model of the form of (60).

$$\begin{aligned} \hat{y}(k)=X^*\tanh \left( J_{u}W_{B}^*+J_{\hat{y}}W_{A}^*\right) \end{aligned}$$
(60)

where the 16 parameters characterizing the system are as follows:

Proposed NARX model:

$$\begin{aligned}&W^*_B = [0.2883 \quad {-}0.3820 \quad 0.0655 \quad 0.1339 \quad {-}0.0821 \quad {-}0.2135 \quad 0.2335]\\&W^*_A = [{-}0.7387 \quad {-}0.0914 \quad 0.3221 \quad 0.3122 \quad {-}0.0812 \quad {-}0.3016 \quad {-}0.0196 \quad 0.3362]\\&X^*={-}1.4623\\ \end{aligned}$$
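For readers who want to reproduce the free-run simulation of (60) with these values, a minimal Python sketch follows; the regressor ordering \([x(k-1),\ldots,x(k-n)]\), the zero initial conditions and the data-file name are assumptions (the DaISy page documents the actual file format).

```python
import numpy as np

# reported parameters of the reduced NARX model, Eq. (60)
W_B = np.array([0.2883, -0.3820, 0.0655, 0.1339, -0.0821, -0.2135, 0.2335])
W_A = np.array([-0.7387, -0.0914, 0.3221, 0.3122, -0.0812, -0.3016, -0.0196, 0.3362])
X = -1.4623

def simulate_reduced_narx(u, W_B, W_A, X):
    """Free-run simulation of y_hat(k) = X* tanh(J_u W_B* + J_yhat W_A*), Eq. (60)."""
    n_b, n_a = len(W_B), len(W_A)
    y = np.zeros(len(u))
    for k in range(len(u)):
        J_u = np.array([u[k - i] if k >= i else 0.0 for i in range(1, n_b + 1)])
        J_y = np.array([y[k - i] if k >= i else 0.0 for i in range(1, n_a + 1)])
        y[k] = X * np.tanh(J_u @ W_B + J_y @ W_A)
    return y

# hypothetical loading step; the column layout is not specified in this paper:
# u, y_meas = np.loadtxt("robot_arm.dat", unpack=True)
# y_hat = simulate_reduced_narx(u, W_B, W_A, X)
```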

Naturally, we have to compare our approach with another black-box system identification method. For example, the same system is identified with the nonlinear system identification toolbox of MATLAB 2013. Here, the identification model used is the NLARX model, and the nonlinearity estimator is a sigmoid network with 5 units, \(n_a=6\) and \(n_b=10\). The \(619\) parameters of the MATLAB model are estimated by using the Levenberg–Marquardt learning algorithm. The comparison, carried out in terms of the number of parameters \(np\) and the validation results proposed in SYSID 2009 [39], is summarized in Table 3.

In order to compare the performances of both models, the frequency response function (FRF), which is of particular interest to users in practical applications, is computed from the measured data and from the estimated outputs of both the proposed NARX neuro-model and the NLARX model obtained with the MATLAB toolbox (see Fig. 10).
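The FRF comparison of Fig. 10 can be reproduced with standard spectral estimation. The paper does not state which estimator was used; one common choice (assumed here) is the H1 estimate \(\hat{H}(f)=S_{uy}(f)/S_{uu}(f)\) with Welch-averaged spectra:

```python
import numpy as np
from scipy import signal

def frf(u, y, fs, nperseg=1024):
    """H1 frequency-response-function estimate H(f) = S_uy(f) / S_uu(f)."""
    f, S_uu = signal.welch(u, fs=fs, nperseg=nperseg)   # input auto-spectrum
    _, S_uy = signal.csd(u, y, fs=fs, nperseg=nperseg)  # input-output cross-spectrum
    return f, S_uy / S_uu

# Apply with the same input u to the measured output and to each model's
# simulated output, then overlay |H(f)| as in Fig. 10.
```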

Fig. 10 Frequency response function (FRF)

Table 3 A comparative assessment with respect to another parametric method

1.1 Comments

Figure 10 shows the benefit of using neural networks, since the estimated models accurately represent the system behavior over a wide frequency range.

Comparing the two black-box models (see Table 3), the deviation errors (\(s_t\) and \(e_{{\mathrm{RMS}}t}\)) of the MATLAB model are only about two times better, while its number of parameters is substantially larger (\(619\) vs \(16\)). In terms of \(\mu _t\), the two models have almost the same accuracy, despite the use of a simple gradient training algorithm in the computation of the proposed NARX model. Remember that the main objective in this paper is to propose to the user a rather simple and efficient way to find a good balance between accuracy and complexity.
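For reference, the figures of merit quoted in Table 3 (\(\mu _t\), \(s_t\) and \(e_{{\mathrm{RMS}}t}\)) are straightforward to compute from the simulation error; the sign convention \(e_{\mathrm{sim}} = y - \hat{y}\) is an assumption:

```python
import numpy as np

def validation_metrics(y_meas, y_sim):
    """Mean, standard deviation and RMS value of the simulation error."""
    e = np.asarray(y_meas) - np.asarray(y_sim)          # e_sim
    return e.mean(), e.std(), np.sqrt(np.mean(e ** 2))  # mu_t, s_t, e_RMSt
```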

Appendix 4: Neuro-identification of an acoustic duct

In order to show the flexibility of the proposed model families, a third system is identified. The experimental device is an acoustic waveguide made of Plexiglas, used to develop active noise control (see Fig. 11). One end of the duct is almost anechoic and the other end is open. The identification input signal, applied to the control loudspeaker, is a sufficiently exciting pseudo-random binary sequence (PRBS) of length \(L = 2^{10} - 1\) and level \(\pm\)3 V. The sampling period is \(T_S\) = 500 \(\upmu\)s. The aim is to model the first propagative modes of the waveguide (also called the secondary path), which lie in the frequency range \([0; 1{,}000\,\mathrm{Hz}]\). We use several data sets of the same length, namely 1,024 samples, measured by the output microphone. A prior measurement of the propagation delay confirms the analytical value \(\tau \approx 7T_S\).
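For reference, a maximum-length PRBS of length \(2^{10}-1\) at \(\pm\)3 V can be generated as sketched below; the mapping to volts and the absence of oversampling are assumptions, not the authors' exact excitation signal.

```python
import numpy as np
from scipy.signal import max_len_seq

seq, _ = max_len_seq(10)          # 2**10 - 1 = 1023 binary values in {0, 1}
prbs = 3.0 * (2.0 * seq - 1.0)    # map {0, 1} -> {-3 V, +3 V}
Ts = 500e-6                       # sampling period T_S = 500 microseconds
t = np.arange(len(prbs)) * Ts     # time axis for one period of the sequence
```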

Fig. 11 Schematic of the semi-infinite acoustic waveguide

This system is identified with the complex 2nn-2-1 neuro-model FF1 (1,418 parameters before reduction) given by (61) with \(nn=40,\,n_a=15\) and \(n_b=18\).

$$\begin{aligned}&\hat{y}(k)=X T\nonumber\\&T=Z_b r_b+Z_a r_a+Z_h\nonumber \\&r_b=\sum _{i=1}^{nn}{V_{b_i}\tanh \left( J_{u}W_{b_{i}}\right) }\nonumber \\&r_a=\sum _{i=1}^{nn}{V_{a_i}\tanh \left( J_{\hat{y}}W_{a_i}\right) } \end{aligned}$$
(61)

Once the model is trained under Assumptions 1 and 2, Theorem 1 (reduction approach) is applied in order to generate a reduced model of the form of (62).

$$\begin{aligned} \hat{y}(k)=V_{B}^*\tanh (J_{u}W_{B}^*)+ V_{A}^*\tanh \left( J_{\hat{y}}W_{A}^*\right) \end{aligned}$$
(62)

where the 35 parameters characterizing the system are as follows:

$$\begin{aligned} W^*_A &= \left[ {-}0.0828 \quad 0.0614 \quad{-}0.0718 \quad {-}0.1201 \quad {-}0.0043 \quad {-}0.0067 \quad {-}0.0530\quad {-}0.0474\right. \\&\left. \quad 0.0032 \quad {-}0.0173 \quad{-}0.0381 \quad {-}0.0135 \quad {-}0.0062 \quad {-}0.0151 \quad {-}0.0140\right] \\W^*_B&= \left[ {-}0.0325 \quad {-}0.0555 \quad 0.0256 \quad0.0393 \quad 0.0007 \quad 0.0274 \quad 0.0132 \quad 0.0202 \quad0.0138\right. \\&\left.\quad 0.0006 \quad 0.0059 \quad {-}0.0139 \quad0.0040 \quad {-}0.0188 \quad {-}0.0110 \quad {-}0.0120 \quad {-}0.0002 \quad{-}0.0058\right] \end{aligned}$$

\(V^*_A=6.8643\) and \(V^*_B=3.4414\)
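A free-run simulation of the reduced FF1 model (62) mirrors the NARX sketch of Appendix 3, with the two tanh branches kept separate; as before, the regressor ordering and the zero initial conditions are assumptions, and the reported \(W_B^*\), \(W_A^*\), \(V_B^*\) and \(V_A^*\) values can be plugged in directly.

```python
import numpy as np

def simulate_reduced_ff1(u, W_B, W_A, V_B, V_A):
    """Free-run simulation of Eq. (62):
    y_hat(k) = V_B* tanh(J_u W_B*) + V_A* tanh(J_yhat W_A*)."""
    n_b, n_a = len(W_B), len(W_A)
    y = np.zeros(len(u))
    for k in range(len(u)):
        J_u = np.array([u[k - i] if k >= i else 0.0 for i in range(1, n_b + 1)])
        J_y = np.array([y[k - i] if k >= i else 0.0 for i in range(1, n_a + 1)])
        y[k] = V_B * np.tanh(J_u @ W_B) + V_A * np.tanh(J_y @ W_A)
    return y
```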

Naturally, the proposed approach has to be compared with another black-box system identification technique. Therefore, the same acoustic system is identified with the nonlinear system identification toolbox of MATLAB 2013. Here, the identification model used is the NLARX model, and the nonlinearity estimator is a sigmoid network with 5 units, \(n_a=14\) and \(n_b=12\). The 1,519 parameters of the MATLAB model are estimated by using the Levenberg–Marquardt learning algorithm. The comparison, carried out in terms of the number of parameters \(np\) and the validation results proposed in SYSID 2009 [39], is summarized in Table 4.

In order to compare the performances of both models in the frequency domain, the frequency response function (FRF), which is of particular interest to users in practical applications, is computed from the measured data and from the estimated outputs of both the proposed FF1 neuro-model and the NLARX model obtained with the MATLAB toolbox (see Fig. 12).

Fig. 12 Frequency response function (FRF)

Table 4 A comparative assessment with respect to another parametric method

1.1 Comments

Figure 12 illustrates the good performance of both the proposed FF1 neuro-model and the MATLAB model with respect to the real system behavior. However, when the number of model parameters is taken into consideration, it is remarkable that the proposed model provides this level of accuracy with only 35 parameters.

From Table 4, the proposed model (FF1) and the model obtained with the MATLAB toolbox have almost the same accuracy, but once again the number of parameters of the MATLAB model is substantially larger (1,519 vs 35). We therefore offer the user a system identification approach whose results are comparable to those of the MATLAB toolbox, but with far fewer parameters.


Cite this article

Romero Ugalde, H.M., Carmona, JC., Reyes-Reyes, J. et al. Balanced simplicity–accuracy neural network model families for system identification. Neural Comput & Applic 26, 171–186 (2015). https://doi.org/10.1007/s00521-014-1716-8
