A Sequential Training Strategy for Locally Recurrent Neural Networks

  • Jie Zhang
  • A. Julian Morris

Abstract

In locally recurrent neural networks, the output of a dynamic neuron is fed back only to itself. This particular structure makes it possible to train the network sequentially. A sequential orthogonal training method is developed in this chapter to train locally recurrent neural networks. The networks considered here contain a single hidden layer, and the dynamic neurons are located in that hidden layer. During network training, the first hidden neuron is used to model the relationship between inputs and outputs, whereas the other hidden neurons are added sequentially to model the relationship between inputs and model residuals. When a hidden neuron is added, its contribution is due to that part of its output vector which is orthogonal to the space spanned by the output vectors of the previous hidden neurons. The Gram-Schmidt orthogonalisation technique is used at each training step to form a set of orthogonal bases for the space spanned by the hidden neuron outputs. The optimum hidden layer weights can be obtained through a gradient-based optimisation method, while the output layer weights can be found using least squares regression. Hidden neurons are added sequentially and the training procedure terminates when the model error is lower than a predefined level. Using this training method, the necessary number of hidden neurons can be found, which helps to avoid the problem of overfitting. Neurons with mixed types of activation functions and dynamic orders can be incorporated into a single network. Mixed node networks can offer improved performance in terms of representation capabilities and network size parsimony. The performance of the proposed technique is demonstrated by application examples.
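The sequential procedure described in the abstract can be illustrated with a minimal sketch. This is not the chapter's implementation: it assumes a tanh neuron with first-order self-feedback, and it replaces the gradient-based optimisation of hidden layer weights with a plain random search over candidate neurons to keep the example short. The names `neuron_output` and `sequential_orthogonal_fit`, and all parameter values, are hypothetical.

```python
import numpy as np

def neuron_output(X, w, b, a):
    """Assumed neuron form: h[t] = tanh(X[t] @ w + b + a * h[t-1])."""
    z = X @ w + b
    h = np.zeros(len(z))
    prev = 0.0
    for t in range(len(z)):
        prev = np.tanh(z[t] + a * prev)  # output fed back only to itself
        h[t] = prev
    return h

def sequential_orthogonal_fit(X, y, tol=1e-3, max_neurons=10,
                              n_candidates=200, seed=0):
    """Add neurons one at a time until the mean squared error drops below tol."""
    rng = np.random.default_rng(seed)
    basis = []    # orthogonal vectors spanning the hidden neuron outputs
    coeffs = []   # least squares coefficient on each basis vector
    residual = np.asarray(y, dtype=float).copy()
    history = [float(np.mean(residual ** 2))]
    while history[-1] > tol and len(basis) < max_neurons:
        best = None
        # Random search stands in for gradient-based optimisation (assumption).
        for _ in range(n_candidates):
            w = rng.normal(size=X.shape[1])
            b = rng.normal()
            a = rng.uniform(-0.9, 0.9)
            q = neuron_output(X, w, b, a)
            # Gram-Schmidt: keep only the part orthogonal to earlier neurons.
            for p in basis:
                q = q - (p @ q) / (p @ p) * p
            qq = q @ q
            if qq < 1e-12:
                continue  # candidate adds nothing new to the spanned space
            g = (q @ residual) / qq   # least squares fit of residual on q
            gain = g * g * qq         # reduction in the sum of squared errors
            if best is None or gain > best[0]:
                best = (gain, q, g)
        if best is None:
            break
        _, q, g = best
        basis.append(q)
        coeffs.append(g)
        residual = residual - g * q
        history.append(float(np.mean(residual ** 2)))
    return basis, coeffs, history

# Small synthetic example: the target is a mix of two recurrent neurons.
rng = np.random.default_rng(1)
X = rng.normal(size=(150, 2))
y = (0.8 * neuron_output(X, np.array([1.0, -0.5]), 0.1, 0.5)
     - 0.3 * neuron_output(X, np.array([0.2, 0.9]), 0.0, -0.4))
basis, coeffs, history = sequential_orthogonal_fit(
    X, y, tol=1e-3, max_neurons=4, n_candidates=100)
print(len(basis), history)
```

Because the basis vectors are mutually orthogonal, the least squares coefficient for each new vector can be computed independently of the earlier ones, and the error in `history` cannot increase when a neuron is added.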

Keywords

Hide Neuron; Model Predictive Control; Recurrent Neural Network; Distillation Column; Continuous Stir Tank Reactor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media New York 1997

Authors and Affiliations

  • Jie Zhang (1)
  • A. Julian Morris (1)
  1. Centre for Process Analysis, Chemometrics and Control, Dept of Chemical & Process Engineering, University of Newcastle, Newcastle upon Tyne, UK
