Abstract
While the gradient-learning algorithm with error back-propagation is a practical method of properly choosing the synaptic weights and thresholds of neurons, it provides no insight into the problem of how to choose the network architecture that is appropriate for the solution of a given problem. How many hidden layers are needed and how many neurons should be contained in each layer? If the number of hidden neurons is too small, no choice of the synapses may yield the accurate mapping between input and output, and the network will fail in the learning stage. If the number is too large, many different solutions will exist, most of which will not result in the ability to generalize correctly for new input data, and the network will usually fail in the operational stage. Instead of learning salient features of the underlying input-output relationship, the network simply learns to distinguish somehow between the various input patterns of the training set and to associate them with the correct output.
Notes
This technique was exploited by Edgar Allan Poe in “The Purloined Letter”.
Details of the network structure have a strong influence on which problems are “simple” and which are “hard” to learn. It would be very helpful to have a way of estimating H(F) for various functions F on a given network without explicitly counting all realizations, but such a method is not known. In many cases, problems intuitively considered “simple” are also simple in the technical sense defined here, but this rule is not generally valid.
In the sense of the scalar product, the two input patterns are even orthogonal, since they have no active neuron in common.
A trivial but not very elegant preprocessor for translationally invariant pattern recognition would simply shift the input pattern slowly around until it “locks in” with one of the stored patterns [Do88].
The system studied by Fuchs and Haken was not a neural network, but a content-addressable memory built from nonlinearly coupled synergetic units [Ha87]. One can expect, however, that the preprocessor coupled to a Hopfield-type neural network would perform similarly.
See on combinatorial optimization.
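The notes above on orthogonal input patterns and on the trivial shifting preprocessor can be illustrated with a small sketch. It is not taken from the chapter: the function name `best_shift_match`, the 0/1 coding of active neurons, the cyclic shift, and the two stored patterns are all assumptions made for illustration. It shows that a translated copy of a stored pattern can have zero scalar product with the original, and that shifting the input step by step until the overlap is maximal still recovers the stored pattern.

```python
import numpy as np

def best_shift_match(pattern, stored):
    """Cyclically shift `pattern` over all positions and return
    (stored index, shift, overlap) for the largest scalar product.
    Hypothetical helper, not the preprocessor of [Do88]."""
    best = (-1, 0, -1)
    for shift in range(len(pattern)):
        shifted = np.roll(pattern, shift)
        for idx, memory in enumerate(stored):
            overlap = int(np.dot(shifted, memory))
            if overlap > best[2]:  # keep the first strictly better match
                best = (idx, shift, overlap)
    return best

# Two illustrative stored patterns (1 = active neuron); values are assumed.
stored = np.array([
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
])

# A translated copy of the second pattern: no active neuron in common.
probe = np.roll(stored[1], 4)
raw_overlap = int(np.dot(probe, stored[1]))

# Brute-force shifting "locks in" with the stored pattern again.
match = best_shift_match(probe, stored)
print(raw_overlap, match)
```

The exhaustive search costs one scalar product per shift and stored pattern, which is why the note calls such a preprocessor trivial but not very elegant.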
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Müller, B., Reinhardt, J., Strickland, M.T. (1995). Network Architecture and Generalization. In: Neural Networks. Physics of Neural Networks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57760-4_9
DOI: https://doi.org/10.1007/978-3-642-57760-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60207-1
Online ISBN: 978-3-642-57760-4
eBook Packages: Springer Book Archive