Autoclass — A Bayesian Approach to Classification

  • Conference paper

Part of the book series: Fundamental Theories of Physics ((FTPH,volume 70))

Abstract

We describe a Bayesian approach to the unsupervised discovery of classes in a set of cases, sometimes called finite mixture separation or clustering. The main difference between clustering and our approach is that we search for the “best” set of class descriptions rather than grouping the cases themselves. We describe our classes in terms of probability distribution or density functions, and the locally maximal posterior probability parameters. We rate our classifications with an approximate posterior probability of the distribution function with respect to the data, obtained by marginalizing over all the parameters. Approximation is necessitated by the computational complexity of the joint probability, and our marginalization is with respect to a local maximum in the parameter space. This posterior probability rating allows direct comparison of alternate density functions that differ in number of classes and/or individual class density functions.
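
The quantities named above can be sketched in a generic finite mixture notation. The symbols are assumptions introduced here for illustration and are not quoted from the paper: $X = \{x_1, \dots, x_N\}$ is the data, $J$ the number of classes, $\pi_j$ the class weights, $\theta_j$ the parameters of class $j$, and $T$ the functional form of the density (number of classes and the choice of individual class densities).

```latex
\begin{align}
p(x_i \mid \pi, \theta, T) &= \sum_{j=1}^{J} \pi_j \, p(x_i \mid \theta_j, T_j),
  \qquad \sum_{j=1}^{J} \pi_j = 1, \\
p(X \mid T) &= \int p(\pi, \theta \mid T) \prod_{i=1}^{N} p(x_i \mid \pi, \theta, T)
  \, d\pi \, d\theta, \\
p(T \mid X) &\propto p(T) \, p(X \mid T).
\end{align}
```

In this reading, the first line is the mixture density, the second is the marginalization over all parameters (the integral that has to be approximated around a local maximum), and the third is the posterior rating used to compare forms $T$ that differ in $J$ or in the class densities.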

We discuss the rationale behind our approach to classification. We give the mathematical development for the basic mixture model, describe the approximations needed for computational tractability, give some specifics of models for several common attribute types, and describe some of the results achieved by the AutoClass program.
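
To make the idea of rating classifications with different numbers of classes concrete, here is a minimal Python sketch. It is not the AutoClass program: it fits a one-dimensional Gaussian mixture with EM (which finds a local maximum of the likelihood) and then scores each fit with the Bayesian information criterion, a crude large-sample stand-in for the log marginal likelihood, swapped in here for whatever approximation the paper develops. All function names, the synthetic data, the quantile-based initialization, and the variance floor are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(data, n_classes, n_iter=100):
    """EM for a 1-D Gaussian mixture from a single start (a local maximum only)."""
    n = len(data)
    ordered = sorted(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    min_var = 1e-3 * grand_var  # floor that prevents a class collapsing onto one case
    weights = [1.0 / n_classes] * n_classes
    # Start the class means at evenly spaced quantiles of the data.
    means = [ordered[(2 * j + 1) * n // (2 * n_classes)] for j in range(n_classes)]
    variances = [grand_var] * n_classes
    for _ in range(n_iter):
        # E-step: class membership probabilities for every case.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means, and variances.
        for j in range(n_classes):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            if nj < 1e-9:
                continue  # effectively empty class: keep its old parameters
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)
    return weights, means, variances


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the fitted mixture."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances))
        total += math.log(max(density, 1e-300))
    return total


def bic_score(data, fit):
    """BIC-style approximation to the log marginal likelihood (higher is better)."""
    weights, means, variances = fit
    n_params = 3 * len(weights) - 1  # J means, J variances, J - 1 free weights
    return log_likelihood(data, *fit) - 0.5 * n_params * math.log(len(data))


# Synthetic data: two well-separated classes of 150 cases each.
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(150)]
        + [rng.gauss(10.0, 1.0) for _ in range(150)])

fits = {j: fit_mixture(data, j) for j in (1, 2, 3)}
scores = {j: bic_score(data, fits[j]) for j in fits}
best = max(scores, key=scores.get)
print("scores:", scores)
print("preferred number of classes:", best)
```

Each fit yields class descriptions (weights, means, variances) rather than a hard grouping of the cases, and the scores of fits with different numbers of classes can be compared directly. A fuller treatment would restart EM from many initial points, since any single run reaches only a local maximum.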

Copyright information

© 1996 Kluwer Academic Publishers

About this paper

Cite this paper

Stutz, J., Cheeseman, P. (1996). Autoclass — A Bayesian Approach to Classification. In: Skilling, J., Sibisi, S. (eds) Maximum Entropy and Bayesian Methods. Fundamental Theories of Physics, vol 70. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-0107-0_13

  • DOI: https://doi.org/10.1007/978-94-009-0107-0_13

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-6534-4

  • Online ISBN: 978-94-009-0107-0

  • eBook Packages: Springer Book Archive
