Abstract
I define the razor, a natural measure of the complexity of a parametric family of distributions relative to a given true distribution. I show that empirical approximations of this quantity may be used to implement parsimonious inference schemes that favour simple models. In particular, the razor is seen to give finer classifications of model families than the Minimum Description Length principle as advocated by Rissanen. In a certain strong sense it is shown that the logarithm of the Bayesian posterior probability of a model family given a collection of data converges in the large sample limit to the logarithm of the razor of the family. This provides the most accurate asymptotics to date for Bayesian parametric inference. These results are derived by treating parametric families as manifolds embedded in the space of probability distributions. In the course of deriving a suitable integration measure on such manifolds, it is shown that, in a certain sense, a uniform prior on the space of probability distributions would induce a Jeffreys’ Prior on the parameters of a parametric family of distributions.
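As a concrete illustration of the quantities named in the abstract, the following minimal sketch evaluates the Jeffreys prior and a razor-like integral for the one-parameter Bernoulli family. The abstract does not state the razor's formula; the code assumes the form R_N = ∫ dμ(θ) exp(−N·D(t‖θ)) / ∫ dμ(θ), with dμ(θ) = sqrt(det J(θ)) dθ built from the Fisher information J and D the relative entropy between the true distribution t and the model at θ. The function names and the choice of the Bernoulli family are hypothetical, chosen only for illustration.

```python
import math


def fisher_information(theta):
    """Fisher information of a Bernoulli(theta) model, 0 < theta < 1."""
    return 1.0 / (theta * (1.0 - theta))


def jeffreys_density(theta):
    """Normalised Jeffreys prior for the Bernoulli family.

    The integral of sqrt(J) over (0, 1) equals pi, so the density is
    sqrt(J(theta)) / pi, i.e. a Beta(1/2, 1/2) distribution.
    """
    return math.sqrt(fisher_information(theta)) / math.pi


def kl_bernoulli(t, theta):
    """Relative entropy D(t || theta) between two Bernoulli distributions."""
    return t * math.log(t / theta) + (1.0 - t) * math.log((1.0 - t) / (1.0 - theta))


def razor_bernoulli(t, n_samples, grid=20000):
    """Assumed razor of the Bernoulli family relative to a true Bernoulli(t).

    Uses the substitution theta = sin(phi)**2, under which the Jeffreys
    measure sqrt(J) d(theta) becomes 2 d(phi): it is uniform in phi on
    (0, pi/2). The razor is then the plain average of exp(-N * D) over phi,
    estimated here with a midpoint rule.
    """
    h = (math.pi / 2.0) / grid
    total = 0.0
    for k in range(grid):
        phi = (k + 0.5) * h
        theta = math.sin(phi) ** 2
        total += math.exp(-n_samples * kl_bernoulli(t, theta))
    return total / grid
```

Under this assumed definition the value is 1 when N = 0 and shrinks as N grows, roughly like sqrt(2 / (πN)) for this one-parameter family, which is the kind of dimension-dependent penalty that favours simpler models.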
References
V. Balasubramanian, “A geometric formulation of Occam’s razor for inference of parametric distributions.” Available as preprint number adap-org/9601001 from http://xyz.lanl.gov/ and as Princeton University Physics Preprint PUPT-1588, January 1996.
J. Rissanen, “Fisher information and stochastic complexity.” Submitted to the IEEE Transactions on Information Theory, 1994.
J. Rissanen, “Universal coding, information, prediction and estimation,” IEEE Transactions on Information Theory, 30, pp. 629–636, July 1984.
J. Rissanen, “Stochastic complexity and modelling,” The Annals of Statistics, 14(3), pp. 1080–1100, 1986.
H. Jeffreys, Theory of Probability, Oxford University Press, 3rd ed., 1961.
P. Lee, Bayesian Statistics: An Introduction, Oxford University Press, 1989.
S. Amari, Differential Geometrical Methods in Statistics, Springer-Verlag, 1985.
S. Amari, O. Barndorff-Nielsen, R. Kass, S. Lauritzen, and C. Rao, Differential Geometry in Statistical Inference, vol. 10, Institute of Mathematical Statistics Lecture Note-Monograph Series, 1987.
B. Clarke and A.R. Barron, “Information-theoretic asymptotics of Bayes methods,” IEEE Transactions on Information Theory, 36, pp. 453–471, May 1990.
A.R. Barron and T. Cover, “Minimum complexity density estimation,” IEEE Transactions on Information Theory, 37, pp. 1034–1054, July 1991.
C. Wallace and P. Freeman, “Estimation and inference by compact coding,” Journal of The Royal Statistical Society, 49, pp. 240–265, July 1987.
T. Cover and J. Thomas, Elements of Information Theory, Wiley, New York, 1991.
J. Conway and N. Sloane, Sphere Packings, Lattices and Groups, Springer-Verlag, New York, 2nd ed., 1988.
A. Barron, Logically Smooth Density Estimation. PhD thesis, Stanford University, August 1985.
K. Yamanishi, “A decision-theoretic extension of stochastic complexity and its applications to learning.” Submitted to the IEEE Transactions on Information Theory, June 1995.
© 1996 Springer Science+Business Media Dordrecht
About this paper
Cite this paper
Balasubramanian, V. (1996). Occam’s Razor for Parametric Families and Priors on the Space of Distributions. In: Hanson, K.M., Silver, R.N. (eds) Maximum Entropy and Bayesian Methods. Fundamental Theories of Physics, vol 79. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5430-7_33
DOI: https://doi.org/10.1007/978-94-011-5430-7_33
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-6284-8
Online ISBN: 978-94-011-5430-7
eBook Packages: Springer Book Archive