Abstract
There are currently several constructive (or growth) algorithms available for training feed-forward neural networks. This paper describes and explains the main ones, taking a fundamental approach to the problem-solving mechanisms of the multi-layer perceptron. The claimed convergence properties of the algorithms are verified using just two mapping theorems, which in turn allows all the algorithms to be unified under a single basic mechanism. The algorithms are compared and contrasted, and the deficiencies of some are highlighted. The fundamental reasons for the success of these algorithms are extracted and used to suggest where they might most fruitfully be applied. A suspicion that they are not a panacea for all current neural network difficulties, and that the learning efficiency they promise must be paid for somewhere along the line, is developed into an argument that their generalization abilities will on average lie below those of back-propagation.
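By way of illustration, the control structure shared by such growth algorithms can be sketched in a few lines. This is a minimal, hypothetical sketch only: the perceptron-style training of each unit and the rule of handing the still-misclassified patterns to the next unit are illustrative assumptions, not a reproduction of any particular algorithm surveyed in the paper (e.g., tiling, upstart, or cascade-correlation).

```python
# Sketch of a generic constructive ("growth") training loop: hidden units
# are added one at a time, each trained to reduce the set of patterns the
# network so far still gets wrong. All function names are illustrative.
import random

def perceptron_fit(patterns, epochs=100, lr=0.1, seed=0):
    """Train one threshold unit on (input, target) pairs; return its weights.

    The last weight entry acts as the bias term.
    """
    rng = random.Random(seed)
    n = len(patterns[0][0])
    w = [rng.uniform(-0.5, 0.5) for _ in range(n + 1)]
    for _ in range(epochs):
        for x, t in patterns:
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0 else 0
            err = t - y  # classical perceptron error-correction rule
            for i in range(n):
                w[i] += lr * err * x[i]
            w[-1] += lr * err
    return w

def unit_output(w, x):
    """Threshold-unit response for input x under weights w (last = bias)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0 else 0

def grow_network(patterns, max_units=5):
    """Add units until every pattern is handled or the unit budget is spent."""
    units = []
    remaining = list(patterns)
    while remaining and len(units) < max_units:
        w = perceptron_fit(remaining)
        units.append(w)
        # keep only the patterns the newest unit still misclassifies
        remaining = [(x, t) for x, t in remaining if unit_output(w, x) != t]
    return units, remaining
```

On a linearly separable task (e.g., logical AND) a single unit suffices and the loop stops immediately; on harder tasks the residual-error set drives the network's growth, which is the essential mechanism the surveyed algorithms have in common.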
Author information
Authors and Affiliations
Additional information
Funded by the German Ministry of Research and Technology, grant number 01 IN 111 A/4.
German National Research Centre for Computer Science (GMD), Schloß Birlinghoven, 5205 St. Augustin 1, Germany.
About this article
Cite this article
Śmieja, F.J. Neural network constructive algorithms: Trading generalization for learning efficiency?. Circuits, Systems and Signal Processing 12, 331–374 (1993). https://doi.org/10.1007/BF01189880