Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition

  • Conference paper

Part of the NATO ASI Series book series (NATO ASI F, volume 68)

Abstract

We are concerned with feed-forward non-linear networks (multi-layer perceptrons, or MLPs) with multiple outputs. We wish to treat the outputs of the network as probabilities of alternatives (e.g. pattern classes), conditioned on the inputs. We look for appropriate output non-linearities and for appropriate criteria for adaptation of the parameters of the network (e.g. weights). We explain two modifications: probability scoring, which is an alternative to squared error minimisation, and a normalised exponential (softmax) multi-input generalisation of the logistic non-linearity. The two modifications together result in quite simple arithmetic, and hardware implementation is not difficult either. The use of radial units (squared distance instead of dot product) immediately before the softmax output stage produces a network which computes posterior distributions over class labels based on an assumption of Gaussian within-class distributions. However, the training, which uses cross-class information, can result in better performance at class discrimination than the usual within-class training method, unless the within-class distribution assumptions are actually correct.
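
To make the two modifications concrete, here is a minimal sketch in NumPy (not the authors' implementation; the reference points, shared variance and three-class setup below are hypothetical) of the normalised exponential (softmax) output stage, the probability-scoring (cross-entropy) criterion, and a check that radial units feeding softmax reproduce the Bayes posterior under equal-prior, shared-variance Gaussian within-class models.

    import numpy as np

    def softmax(a):
        # Normalised exponential: Q_j = exp(a_j) / sum_k exp(a_k).
        # Subtracting the max leaves the result unchanged but avoids overflow.
        a = a - np.max(a, axis=-1, keepdims=True)
        e = np.exp(a)
        return e / np.sum(e, axis=-1, keepdims=True)

    def probability_score(q, t, eps=1e-12):
        # Cross-entropy ("probability scoring") criterion between target
        # probabilities t and network outputs q, used in place of squared error.
        return -np.sum(t * np.log(q + eps))

    # Radial units feeding softmax: each activation is a negative scaled
    # squared distance to a per-class reference point (hypothetical values).
    rng = np.random.default_rng(0)
    x = rng.normal(size=2)              # one input vector
    means = rng.normal(size=(3, 2))     # one reference point per class
    sigma2 = 0.5                        # assumed shared isotropic variance

    activations = -np.sum((x - means) ** 2, axis=1) / (2.0 * sigma2)
    q = softmax(activations)

    # Under equal priors and a shared isotropic Gaussian covariance, the same
    # numbers come from an explicit Bayes-rule computation (the Gaussian
    # normalising constant cancels between classes).
    unnorm_lik = np.exp(-np.sum((x - means) ** 2, axis=1) / (2.0 * sigma2))
    posterior = unnorm_lik / unnorm_lik.sum()
    assert np.allclose(q, posterior)

    # Probability score against a one-hot target for class 0.
    t = np.array([1.0, 0.0, 0.0])
    print(q, probability_score(q, t))

The final check only holds under those equal-prior, shared-variance assumptions; when they fail, this is where the abstract's point applies, namely that discriminative (cross-class) training of the same architecture can outperform fitting each class model separately.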

Keywords

  • Hidden Markov Model
  • Posterior Distribution
  • Class Label
  • Boltzmann Machine
  • Statistical Pattern Recognition

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© 1990 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bridle, J.S. (1990). Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition. In: Soulié, F.F., Hérault, J. (eds) Neurocomputing. NATO ASI Series, vol 68. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-76153-9_28

  • DOI: https://doi.org/10.1007/978-3-642-76153-9_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-76155-3

  • Online ISBN: 978-3-642-76153-9
