On the Relationship between Classification Error Bounds and Training Criteria in Statistical Pattern Recognition

  • Conference paper
  • In: Pattern Recognition and Image Analysis (IbPRIA 2003)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2652)

Abstract

We present two novel bounds on the classification error that can, at the same time, be used as practical training criteria. Unlike the bounds reported in the literature so far, these novel bounds are based on a strict distinction between the true but unknown distribution and the model distribution that is used in the decision rule. The two bounds we derive are based on the squared distance and on the Kullback-Leibler distance, respectively; in both cases the distance is computed between the true distribution and the model distribution. As practical training criteria, these bounds result in the squared error criterion and the mutual information (or equivocation) criterion, respectively.
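The derivation itself is only available in the full chapter. The following is a minimal LaTeX sketch of how posterior-distance bounds of this kind typically yield the two stated criteria, using the standard plug-in argument together with the Cauchy-Schwarz and Pinsker inequalities; this need not be the paper's exact route, and the notation (pr(c|x) for the true posterior, q_theta(c|x) for the model posterior, F_SE and F_MMI for the criteria) is introduced here purely for illustration.

% Plug-in decision rule: \hat{c}(x) = \arg\max_c q_\theta(c|x).
% Step 1: the excess error over the Bayes error E^* is controlled by the
% expected L1 distance between the true and the model posterior:
\[
E(q_\theta) - E^* \;\le\; \mathbb{E}_x\Big[\sum_{c=1}^{C} \big|\, pr(c|x) - q_\theta(c|x) \,\big|\Big].
\]
% Step 2a: Cauchy-Schwarz (plus Jensen) gives a squared-distance bound,
% whose empirical counterpart is the squared error criterion:
\[
E(q_\theta) - E^* \;\le\; \sqrt{C}\,\Big(\mathbb{E}_x \sum_{c} \big( pr(c|x) - q_\theta(c|x) \big)^2 \Big)^{1/2},
\qquad
F_{SE}(\theta) = \sum_{n}\sum_{c}\big(q_\theta(c|x_n) - \delta(c,c_n)\big)^2 .
\]
% Step 2b: Pinsker's inequality (plus Jensen) gives a Kullback-Leibler
% bound, whose empirical counterpart is the mutual information
% (conditional likelihood) criterion, to be maximized:
\[
E(q_\theta) - E^* \;\le\; \Big(2\,\mathbb{E}_x\, D\big(pr(\cdot|x)\,\big\|\,q_\theta(\cdot|x)\big)\Big)^{1/2},
\qquad
F_{MMI}(\theta) = \sum_{n} \log q_\theta(c_n|x_n) .
\]

In expectation over the true distribution, F_SE estimates the squared posterior distance up to an additive term independent of theta, and maximizing F_MMI minimizes the empirical cross-entropy, i.e. the Kullback-Leibler distance up to the equivocation H(C|X); this is the sense in which optimizing either criterion tightens the corresponding bound.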




Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ney, H. (2003). On the Relationship between Classification Error Bounds and Training Criteria in Statistical Pattern Recognition. In: Perales, F.J., Campilho, A.J.C., de la Blanca, N.P., Sanfeliu, A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2003. Lecture Notes in Computer Science, vol 2652. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44871-6_74

  • DOI: https://doi.org/10.1007/978-3-540-44871-6_74

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40217-6

  • Online ISBN: 978-3-540-44871-6
