Perspectives of Current Research about the Complexity of Learning on Neural Nets

  • Wolfgang Maass


This chapter discusses, within the framework of computational learning theory, the current state of knowledge and some open problems in three areas of research on learning in feedforward neural nets:
  • Neural nets that learn from mistakes
  • Bounds for the Vapnik-Chervonenkis dimension of neural nets
  • Agnostic PAC-learning of functions on neural nets
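As a toy illustration of the Vapnik-Chervonenkis dimension addressed by the second topic, the following sketch (hypothetical helper names, not from the chapter) brute-forces which labelings the one-dimensional threshold class h_t(x) = [x ≥ t] can realize on a finite point set; a set is shattered when every labeling is achievable.

```python
def shattered(points, thresholds):
    """Return True iff the threshold class h_t(x) = [x >= t], with t ranging
    over `thresholds`, realizes all 2^n labelings of `points` (shatters it)."""
    achievable = {tuple(x >= t for x in points) for t in thresholds}
    return len(achievable) == 2 ** len(points)

# Candidate thresholds below, between, and above the sample points; for
# one-dimensional thresholds these cover every realizable labeling.
ts = [0.5, 1.5, 2.5]
print(shattered([1.0], ts))       # True: any single point is shattered
print(shattered([1.0, 2.0], ts))  # False: no threshold labels 1.0 positive and 2.0 negative
```

Since every one-point set is shattered but no two-point set is, the VC-dimension of this class is 1; the chapter surveys how such bounds grow with the number of programmable parameters and gates of a network.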







Copyright information

© Springer Science+Business Media New York 1994

Authors and Affiliations

  • Wolfgang Maass
    Institute for Theoretical Computer Science, Technische Universität Graz, Austria
