Algebraic Analysis for Singular Statistical Estimation
This paper clarifies the learning efficiency of a non-regular parametric model, such as a neural network, whose true parameter set is an analytic variety with singular points. Using Sato's b-function, we rigorously prove that the free energy, or Bayesian stochastic complexity, is asymptotically equal to $\lambda_1 \log n - (m_1 - 1) \log\log n + \text{constant}$, where $\lambda_1$ is a rational number, $m_1$ is a natural number, and $n$ is the number of training samples. We also give an algorithm for calculating $\lambda_1$ and $m_1$ based on the resolution of singularities. In regular models, $2\lambda_1$ equals the number of parameters and $m_1 = 1$, whereas in non-regular models such as neural networks, $2\lambda_1$ is smaller than the number of parameters and $m_1 \geq 1$.
Keywords: Generalization Error, Regular Model, Algebraic Analysis, Layered Neural Network, Statistical Estimation Error
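To make the asymptotic form stated in the abstract concrete, the following minimal Python sketch evaluates its leading terms, λ1 log n − (m1 − 1) log log n, for a regular model and for a non-regular one. The function names and the default constant of zero are assumptions introduced here for illustration. The non-regular coefficients (λ1 = 1/2, m1 = 2) are hypothetical inputs chosen only to satisfy the abstract's constraints. They are not results taken from the paper.

```python
import math


def asymptotic_free_energy(n, lam, m, constant=0.0):
    """Leading terms lam*log(n) - (m - 1)*log(log(n)) + constant.

    n must exceed 1 so that log(log(n)) is defined.
    The additive constant is unknown in general; 0.0 is a placeholder.
    """
    if n <= 1:
        raise ValueError("n must be greater than 1")
    return lam * math.log(n) - (m - 1) * math.log(math.log(n)) + constant


def regular_coefficients(num_params):
    """Regular case from the abstract: 2*lambda_1 = number of parameters, m_1 = 1."""
    return num_params / 2.0, 1


if __name__ == "__main__":
    d = 2  # number of parameters (illustrative)
    lam_reg, m_reg = regular_coefficients(d)
    # Hypothetical non-regular coefficients: 2*lam < d and m >= 1.
    lam_sing, m_sing = 0.5, 2
    for n in (10, 100, 1000, 10000):
        f_reg = asymptotic_free_energy(n, lam_reg, m_reg)
        f_sing = asymptotic_free_energy(n, lam_sing, m_sing)
        print(f"n={n:6d}  regular={f_reg:8.4f}  non-regular={f_sing:8.4f}")
```

With these inputs the non-regular value grows more slowly in n than the regular one, which is the qualitative difference the abstract describes.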