Algebraic Analysis for Singular Statistical Estimation
This paper clarifies the learning efficiency of non-regular parametric models, such as neural networks, whose true parameter set is an analytic variety with singular points. Using Sato's b-function, we rigorously prove that the free energy, or Bayesian stochastic complexity, is asymptotically equal to λ1 log n − (m1 − 1) log log n + constant, where λ1 is a rational number, m1 is a natural number, and n is the number of training samples. We also give an algorithm for calculating λ1 and m1 based on resolution of singularities. In regular models, 2λ1 equals the number of parameters and m1 = 1, whereas in non-regular models such as neural networks, 2λ1 is smaller than the number of parameters and m1 ≥ 1.
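As a minimal worked illustration (not taken from the paper itself), consider a hypothetical two-parameter model whose Kullback divergence from the true distribution behaves near the singularity like $K(a,b)=a^2b^2$, with a prior that is uniform on $[0,1]^2$. The constants $\lambda_1$ and $m_1$ appear as the largest pole of the associated zeta function and its order:

```latex
\[
\zeta(z) \;=\; \int_0^1\!\!\int_0^1 K(a,b)^z \, da\, db
\;=\; \int_0^1\!\!\int_0^1 (a^2 b^2)^z \, da\, db
\;=\; \frac{1}{(2z+1)^2},
\]
\[
\text{largest pole } z=-\tfrac{1}{2}\ \text{of order } 2
\;\Longrightarrow\;
\lambda_1=\tfrac{1}{2},\quad m_1=2,
\]
\[
F(n) \;\approx\; \tfrac{1}{2}\log n \;-\; \log\log n \;+\; O(1).
\]
```

Note that a regular two-parameter model would instead give $F(n)\approx \log n$, so the singularity at the origin strictly reduces the stochastic complexity, as the abstract states.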