# Vapnik-Chervonenkis dimension of recurrent neural networks

Conference paper

## Abstract

Most of the work on the Vapnik-Chervonenkis (VC) dimension of neural networks has focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number *w* of weights in the network, and the length *k* of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of *w* only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length; therefore we are particularly interested in the case *k* ≫ *w*. Ignoring multiplicative constants, the main results say roughly the following:

- For architectures with activation *σ* = any fixed nonlinear polynomial, the VC dimension is ≈ *wk*.
- For architectures with activation *σ* = any fixed *piecewise* polynomial, the VC dimension is between *wk* and *w*^{2}*k*.
- For architectures with activation *σ* = H (threshold nets), the VC dimension is between *w* log(*k/w*) and min{*wk* log *wk*, *w*^{2} + *w* log *wk*}.
- For the standard sigmoid *σ*(*x*) = 1/(1+*e*^{−x}), the VC dimension is between *wk* and *w*^{4}*k*^{2}.
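To see how the four bound pairs compare at concrete values of *w* and *k*, the sketch below tabulates them. The helper `vc_dimension_bounds` is hypothetical (not from the paper); multiplicative constants are ignored, as in the abstract, and the natural logarithm stands in for the unspecified log base.

```python
import math

def vc_dimension_bounds(w, k):
    """Return (lower, upper) asymptotic VC-dimension bounds per activation
    type, as stated in the abstract, ignoring multiplicative constants.
    Hypothetical illustration only; 'log' here is the natural logarithm."""
    return {
        # sigma = fixed nonlinear polynomial: tight bound, roughly wk
        "polynomial": (w * k, w * k),
        # sigma = fixed piecewise polynomial: between wk and w^2 k
        "piecewise_polynomial": (w * k, w**2 * k),
        # sigma = H (threshold): between w log(k/w) and
        # min{wk log wk, w^2 + w log wk}
        "threshold": (
            w * math.log(k / w),
            min(w * k * math.log(w * k), w**2 + w * math.log(w * k)),
        ),
        # standard sigmoid 1/(1+e^{-x}): between wk and w^4 k^2
        "sigmoid": (w * k, w**4 * k**2),
    }

# Example in the regime k >> w that the paper emphasizes:
bounds = vc_dimension_bounds(w=10, k=1000)
```

For *k* ≫ *w*, the threshold upper bound is dominated by the *w*^{2} + *w* log *wk* term, which grows only logarithmically in *k*, while the polynomial, piecewise-polynomial, and sigmoid bounds all grow at least linearly in *k*.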

## References

1. E.B. Baum and D. Haussler, “What size net gives valid generalization?”, *Neural Computation* **1** (1989), pp. 151–160.
2. Y. Bengio, *Neural Networks for Speech and Sequence Recognition*, Thompson Computer Press, Boston, 1996.
3. T.M. Cover, “Capacity problems for linear machines”, in *Pattern Recognition* (L. Kanal, ed.), Thompson Book Co., 1968, pp. 283–289.
4. B. Dasgupta and E.D. Sontag, “Sample complexity for learning recurrent perceptron mappings,” *IEEE Trans. Inform. Theory*, September 1996, to appear. (Summary in *Advances in Neural Information Processing Systems 8 (NIPS95)* (D.S. Touretzky, M.C. Moser, and M.E. Hasselmo, eds.), MIT Press, Cambridge, MA, 1996, pp. 204–210.)
5. C.L. Giles, G.Z. Sun, H.H. Chen, Y.C. Lee and D. Chen, “Higher order recurrent networks and grammatical inference”, in *Advances in Neural Information Processing Systems 2* (D.S. Touretzky, ed.), Morgan Kaufmann, San Mateo, CA, 1990.
6. P. Goldberg and M. Jerrum, “Bounding the Vapnik-Chervonenkis dimension of concept classes parametrized by real numbers,” *Machine Learning* **18** (1995), pp. 131–148.
7. M. Karpinski and A. Macintyre, “Polynomial bounds for VC dimension of sigmoidal and general Pfaffian neural networks,” *J. Computer Sys. Sci.*, to appear. (Summary in “Polynomial bounds for VC dimension of sigmoidal neural networks,” in *Proc. 27th ACM Symposium on Theory of Computing*, 1995, pp. 200–208.)
8. P. Koiran and E.D. Sontag, “Neural networks with quadratic VC dimension,” *J. Computer Sys. Sci.*, to appear. (Summary in *Advances in Neural Information Processing Systems 8 (NIPS95)* (D.S. Touretzky, M.C. Moser, and M.E. Hasselmo, eds.), MIT Press, Cambridge, MA, 1996, pp. 197–203.)
9. M. Matthews, “A state-space approach to adaptive nonlinear filtering using recurrent neural networks,” *Proc. 1990 IASTED Symp. on Artificial Intelligence Applications and Neural Networks*, Zürich, July 1990, pp. 197–200.
10. M.M. Polycarpou and P.A. Ioannou, “Neural networks and on-line approximators for adaptive control,” in *Proc. Seventh Yale Workshop on Adaptive and Learning Systems*, Yale University, 1992, pp. 93–798.
11. H. Siegelmann and E.D. Sontag, “On the computational power of neural nets,” *J. Comp. Syst. Sci.* **50** (1995), pp. 132–150.
12. H. Siegelmann and E.D. Sontag, “Analog computation, neural networks, and circuits,” *Theor. Comp. Sci.* **131** (1994), pp. 331–360.
13. E.D. Sontag, *Mathematical Control Theory: Deterministic Finite Dimensional Systems*, Springer, New York, 1990.
14. E.D. Sontag, “Neural nets as systems models and controllers,” in *Proc. Seventh Yale Workshop on Adaptive and Learning Systems*, Yale University, 1992, pp. 73–79.
15. E.D. Sontag, “Feedforward nets for interpolation and classification,” *J. Comp. Syst. Sci.* **45** (1992), pp. 20–48.
16. A.M. Zador and B.A. Pearlmutter, “VC dimension of an integrate-and-fire neuron model,” *Neural Computation* **8** (1996), pp. 611–624.

## Copyright information

© Springer-Verlag Berlin Heidelberg 1997