
Improving backpropagation learning with feature selection

Published in: Applied Intelligence

Abstract

Real-world data often contain redundant, irrelevant, and noisy attributes. Training a network on properly selected data can speed up training, simplify the learned structure, and improve performance. A two-phase training algorithm is proposed. In the first phase, the number of input units of the network is determined by an information-based method: only those attributes that meet certain inclusion criteria are used as inputs to the network. In the second phase, the number of hidden units is selected automatically based on the network's performance on the training data; one hidden unit is added at a time, and only when it is necessary. Experimental results show that the new algorithm achieves a faster learning time, a simpler network, and improved performance.




About this article

Cite this article

Setiono, R., Liu, H. Improving backpropagation learning with feature selection. Appl Intell 6, 129–139 (1996). https://doi.org/10.1007/BF00117813
