Occam’s Razor for Parametric Families and Priors on the Space of Distributions

  • Conference paper
Maximum Entropy and Bayesian Methods

Part of the book series: Fundamental Theories of Physics (FTPH, volume 79)

Abstract

I define the razor, a natural measure of the complexity of a parametric family of distributions relative to a given true distribution. I show that empirical approximations of this quantity may be used to implement parsimonious inference schemes that favour simple models. In particular, the razor is seen to give finer classifications of model families than the Minimum Description Length principle as advocated by Rissanen. In a certain strong sense it is shown that the logarithm of the Bayesian posterior probability of a model family given a collection of data converges in the large sample limit to the logarithm of the razor of the family. This provides the most accurate asymptotics to date for Bayesian parametric inference. These results are derived by treating parametric families as manifolds embedded in the space of probability distributions. In the course of deriving a suitable integration measure on such manifolds, it is shown that, in a certain sense, a uniform prior on the space of probability distributions would induce a Jeffreys’ Prior on the parameters of a parametric family of distributions.
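As a minimal numerical sketch of the quantity described above, assume the razor of a family A relative to a true distribution t takes the form R_N(A) = ∫ dμ(θ) exp(−N D(t‖θ)) / ∫ dμ(θ), where D is the relative entropy and dμ(θ) = sqrt(det J(θ)) dθ is the Jeffreys measure built from the Fisher information J. The abstract does not state this formula, so it is an assumption here, as is the choice of the one-parameter Bernoulli family for illustration. The function names `kl_bernoulli` and `razor_bernoulli` are hypothetical.

```python
import math


def kl_bernoulli(t, theta, one_minus_theta):
    """Relative entropy D(t || theta) between Bernoulli distributions, for 0 < t < 1."""
    return (t * math.log(t / theta)
            + (1.0 - t) * math.log((1.0 - t) / one_minus_theta))


def razor_bernoulli(t, n_samples, steps=20000):
    """Assumed razor of the Bernoulli family relative to a true Bernoulli(t).

    The Fisher information is J(theta) = 1 / (theta * (1 - theta)), so the
    Jeffreys measure is d(theta) / sqrt(theta * (1 - theta)). Substituting
    theta = sin(phi)**2 turns that measure into 2 * d(phi) on [0, pi/2],
    which removes the endpoint singularities; the total volume is pi.
    """
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h              # midpoint rule, never hits 0 or pi/2
        theta = math.sin(phi) ** 2
        one_minus_theta = math.cos(phi) ** 2
        total += math.exp(-n_samples * kl_bernoulli(t, theta, one_minus_theta))
    # (2 * d(phi)) integrated, divided by the total Jeffreys volume pi
    return (2.0 / math.pi) * total * h


if __name__ == "__main__":
    for n in (0, 10, 100, 1000):
        print(n, razor_bernoulli(0.5, n))
```

Under these assumptions the value equals 1 at N = 0 and shrinks as N grows, because only the part of the family close to the true distribution (measured in the Jeffreys volume) continues to contribute. A family with more parameters would typically have a smaller such fraction, which is the sense in which the quantity penalises complexity.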

References

  1. V. Balasubramanian, “A geometric formulation of Occam’s razor for inference of parametric distributions.” Available as preprint number adap-org/9601001 from http://xyz.lanl.gov/ and as Princeton University Physics Preprint PUPT-1588, January 1996.

  2. J. Rissanen, “Fisher information and stochastic complexity.” Submitted to the IEEE Transactions on Information Theory, 1994.

  3. J. Rissanen, “Universal coding, information, prediction and estimation,” IEEE Transactions on Information Theory, 30, pp. 629–636, July 1984.

  4. J. Rissanen, “Stochastic complexity and modelling,” The Annals of Statistics, 14(3), pp. 1080–1100, 1986.

  5. H. Jeffreys, Theory of Probability, Oxford University Press, 3rd ed., 1961.

  6. P. Lee, Bayesian Statistics: An Introduction, Oxford University Press, 1989.

  7. S. Amari, Differential Geometrical Methods in Statistics, Springer-Verlag, 1985.

  8. S. Amari, O. Barndorff-Nielsen, R. Kass, S. Lauritzen, and C. Rao, Differential Geometry in Statistical Inference, vol. 10, Institute of Mathematical Statistics Lecture Note-Monograph Series, 1987.

  9. B. Clarke and A.R. Barron, “Information-theoretic asymptotics of Bayes methods,” IEEE Transactions on Information Theory, 36, pp. 453–471, May 1990.

  10. A.R. Barron and T. Cover, “Minimum complexity density estimation,” IEEE Transactions on Information Theory, 37, pp. 1034–1054, July 1991.

  11. C. Wallace and P. Freeman, “Estimation and inference by compact coding,” Journal of The Royal Statistical Society, 49, pp. 240–265, July 1987.

  12. T. Cover and J. Thomas, Elements of Information Theory, Wiley, New York, 1991.

  13. J. Conway and N. Sloane, Sphere Packings, Lattices and Groups, Springer-Verlag, New York, 2nd ed., 1988.

  14. A. Barron, Logically Smooth Density Estimation. PhD thesis, Stanford University, August 1985.

  15. K. Yamanishi, “A decision-theoretic extension of stochastic complexity and its applications to learning.” Submitted to the IEEE Transactions on Information Theory, June 1995.

Copyright information

© 1996 Springer Science+Business Media Dordrecht

About this paper

Cite this paper

Balasubramanian, V. (1996). Occam’s Razor for Parametric Families and Priors on the Space of Distributions. In: Hanson, K.M., Silver, R.N. (eds) Maximum Entropy and Bayesian Methods. Fundamental Theories of Physics, vol 79. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5430-7_33

  • DOI: https://doi.org/10.1007/978-94-011-5430-7_33

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-6284-8

  • Online ISBN: 978-94-011-5430-7

  • eBook Packages: Springer Book Archive
