Algebraic Analysis for Singular Statistical Estimation

  • Sumio Watanabe
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1720)

Abstract

This paper clarifies the learning efficiency of a non-regular parametric model, such as a neural network, whose true parameter set is an analytic variety with singular points. Using Sato’s b-function, we rigorously prove that the free energy, or Bayesian stochastic complexity, is asymptotically equal to λ1 log n − (m1 − 1) log log n + constant, where λ1 is a rational number, m1 is a natural number, and n is the number of training samples. We also show an algorithm to calculate λ1 and m1 based on the resolution of singularities. In regular models, 2λ1 is equal to the number of parameters and m1 = 1, whereas in non-regular models such as neural networks, 2λ1 is smaller than the number of parameters and m1 ≥ 1.
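
The asymptotic form stated in the abstract can be checked numerically on a small example. The sketch below is an illustration only and is not taken from the paper: it assumes a toy two-parameter function K(a, b) = a²b² on the unit square with a uniform prior, for which the standard values are λ1 = 1/2 and m1 = 2 (the integral of (a²b²)^z over the square equals 1/(2z+1)², a double pole at z = −1/2). The function names `free_energy_toy` and `leading_terms` are hypothetical helpers introduced here.

```python
import math

def free_energy_toy(n, steps=200_000):
    """F(n) = -log Z(n), Z(n) = integral over [0,1]^2 of exp(-n a^2 b^2) da db.

    Assumed toy model, not from the paper. The inner integral over a is
        int_0^1 exp(-n b^2 a^2) da = sqrt(pi) * erf(sqrt(n) b) / (2 sqrt(n) b),
    and substituting t = sqrt(n) b gives
        Z(n) = sqrt(pi) / (2 sqrt(n)) * int_0^sqrt(n) erf(t) / t dt,
    which is evaluated with the midpoint rule (never touches t = 0).
    """
    upper = math.sqrt(n)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.erf(t) / t
    z = math.sqrt(math.pi) / (2.0 * upper) * total * h
    return -math.log(z)

def leading_terms(n, lam, m):
    """lam * log n - (m - 1) * log log n, the non-constant part of the formula."""
    return lam * math.log(n) - (m - 1) * math.log(math.log(n))

for n in (10**4, 10**6):
    f = free_energy_toy(n)
    singular = leading_terms(n, 0.5, 2)   # lambda1 = 1/2, m1 = 2 (toy model)
    regular = leading_terms(n, 1.0, 1)    # regular value: lambda1 = d/2 = 1, m1 = 1
    print(n, round(f, 3), round(f - singular, 3), round(f - regular, 3))
```

In this toy case the residual against the singular terms should stay bounded as n grows, although it settles slowly (at a rate of order 1/log n), while the residual against the regular-model terms keeps drifting. That is the sense in which 2λ1 falls below the parameter count.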

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Sumio Watanabe (1)
  1. P&I Lab., Tokyo Institute of Technology, Yokohama, Japan
