Neural Computing and Applications, Volume 16, Issue 4–5, pp 317–325

Controlling the parallel layer perceptron complexity using a multiobjective learning algorithm

  • D. A. G. Vieira
  • J. A. Vasconcelos
  • W. M. Caminhas
Original Article

Abstract

This paper deals with complexity control of the parallel layer perceptron (PLP), i.e., the bias and variance dilemma, using a multiobjective (MOBJ) training algorithm. To control the bias and variance, the training process is rewritten as a bi-objective problem that minimizes both the training error and the norm of the weight vector, the latter being a measure of the network complexity. The method is applied to regression and classification problems and compared with several other training procedures and topologies. The results show that the PLP MOBJ training algorithm achieves good generalization, outperforming traditional methods on the tested examples.
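
To make the bi-objective formulation concrete, the minimal sketch below traces an approximate Pareto front between training error and weight-vector norm by sweeping a weighted-sum trade-off parameter over a ridge-regularized random-feature model. This is an illustration only, not the authors' PLP or MOBJ implementation: the toy sine data, the tanh feature map, and the weighted-sum scalarization are all assumptions made for the example.

```python
# Illustrative sketch (not the paper's PLP/MOBJ algorithm): approximate the
# Pareto front of the bi-objective problem
#     minimize (J1, J2) = (training MSE, ||w||)
# by sweeping a weighted-sum trade-off parameter on a linear-in-features model.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: a noisy sine wave (hypothetical, not from the paper)
X = rng.uniform(-3.0, 3.0, size=(60, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(60)

# A fixed random tanh feature map stands in for the network's hidden mapping
W_in = rng.standard_normal((1, 20))
b_in = rng.standard_normal(20)
H = np.tanh(X @ W_in + b_in)  # shape (60, 20)

# Each lambda scalarizes the two objectives (MSE, ||w||^2) and yields one
# Pareto-candidate model of a different effective complexity
front = []
for lam in np.logspace(-4, 2, 25):
    # Closed-form minimizer of  MSE + lam * ||w||^2  (ridge regression)
    w = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)
    mse = float(np.mean((H @ w - y) ** 2))
    front.append((mse, float(np.linalg.norm(w)), lam))

# Print a few points of the error-versus-norm front
for mse, norm, lam in front[::6]:
    print(f"lambda={lam:9.4f}  train_mse={mse:.4f}  ||w||={norm:.3f}")
```

Each point on the resulting error-versus-norm front corresponds to one candidate network; a separate decision rule, such as error on a validation set, would then select the final solution from the front.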

Keywords

Parallel layer perceptron · Neural networks · Learning algorithms · Machine learning · Multiobjective training algorithm


Copyright information

© Springer-Verlag London Limited 2006

Authors and Affiliations

  • D. A. G. Vieira (1)
  • J. A. Vasconcelos (1)
  • W. M. Caminhas (1)

  1. Department of Electrical Engineering, Federal University of Minas Gerais, Belo Horizonte, Brazil
