Studies of ensemble methods for classification suffer from the difficulty of modeling the complementary strengths of the component classifiers. Kleinberg’s theory of stochastic discrimination (SD) addresses this rigorously via mathematical notions of enrichment, uniformity, and projectability of a model ensemble. We explain these concepts via a simple numerical example that captures the basic principles of the SD theory and method. We focus on a fundamental symmetry in point set covering that is the key observation leading to the foundation of the theory. We believe a better understanding of the SD method will lead to the development of better tools for analyzing other ensemble methods.
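The core mechanics can be sketched in a few lines. The following is a hypothetical toy illustration, not the paper's own numerical example: the one-dimensional data, the interval-shaped weak models, and the model count are all invented for illustration. Weak models are random subsets (here, intervals); a model is "enriched" for class 1 when it covers class 1 training points at a strictly higher rate than class 2 points; and the SD discriminant averages normalized coverage indicators over many such models. By construction, the per-class averages of the discriminant are exactly 1 for class 1 and 0 for class 2 — the covering symmetry the abstract refers to.

```python
import random

random.seed(42)

# Hypothetical 1-D training data: two classes of points on [0, 1].
class1 = [0.05, 0.15, 0.25, 0.35, 0.45]
class2 = [0.55, 0.65, 0.75, 0.85, 0.95]

def coverage(interval, points):
    """Fraction of the given points covered by the interval (a weak model)."""
    a, b = interval
    return sum(a <= x <= b for x in points) / len(points)

# Sample random intervals and keep only those "enriched" for class 1,
# i.e. covering class 1 at a strictly higher rate than class 2.
models = []
while len(models) < 2000:
    interval = tuple(sorted((random.random(), random.random())))
    p1 = coverage(interval, class1)
    p2 = coverage(interval, class2)
    if p1 > p2:
        models.append((interval, p1, p2))

def discriminant(x):
    """Average of normalized coverage indicators over all enriched models.

    Each model contributes (indicator - p2) / (p1 - p2), so a model's
    average contribution over class 1 points is exactly 1 and over
    class 2 points exactly 0; ensembling many weak models then gives
    a discriminant separating the classes.
    """
    return sum(((m[0] <= x <= m[1]) - p2) / (p1 - p2)
               for m, p1, p2 in models) / len(models)

scores1 = [discriminant(x) for x in class1]
scores2 = [discriminant(x) for x in class2]
# The class means of these scores are exactly 1 and 0 (up to float error),
# regardless of which enriched models were sampled.
```

The exactness of the class means (1 and 0) holds per model, not just in the limit, which is why it survives any random sample of enriched models; what the ensemble size buys is low variance of the score at each individual point.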


Keywords: Feature Space, Machine Intelligence, Model Ensemble, Ensemble Method, Ensemble Learning


  1. Berlind, R.: An Alternative Method of Stochastic Discrimination with Applications to Pattern Recognition. Doctoral Dissertation, Department of Mathematics, State University of New York at Buffalo (1994)
  2. Breiman, L.: Bagging predictors. Machine Learning 24, 123–140 (1996)
  3. Chen, D.: Estimates of Classification Accuracies for Kleinberg’s Method of Stochastic Discrimination in Pattern Recognition. Doctoral Dissertation, Department of Mathematics, State University of New York at Buffalo (1998)
  4. Dietterich, T.G., Bakiri, G.: Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research 2, 263–286 (1995)
  5. Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Proceedings of the Thirteenth International Conference on Machine Learning, Bari, Italy, July 3–6, pp. 148–156 (1996)
  6. Hansen, L.K., Salamon, P.: Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence 12(10), 993–1001 (1990)
  7. Ho, T.K.: Random decision forests. In: Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, Canada, August 14–18, pp. 278–282 (1995)
  8. Ho, T.K.: Multiple classifier combination: lessons and next steps. In: Kandel, A., Bunke, H. (eds.) Hybrid Methods in Pattern Recognition. World Scientific, Singapore (2002)
  9. Ho, T.K., Kleinberg, E.M.: Building projectable classifiers of arbitrary complexity. In: Proceedings of the 13th International Conference on Pattern Recognition, Vienna, Austria, August 25–30, pp. 880–885 (1996)
  10. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(8), 832–844 (1998)
  11. Ho, T.K.: Nearest neighbors in random subspaces. In: Proceedings of the Second International Workshop on Statistical Techniques in Pattern Recognition, Sydney, Australia, August 11–13, pp. 640–648 (1998)
  12. Ho, T.K., Hull, J.J., Srihari, S.N.: Decision combination in multiple classifier systems. IEEE Transactions on Pattern Analysis and Machine Intelligence 16(1), 66–75 (1994)
  13. Huang, Y.S., Suen, C.Y.: A method of combining multiple experts for the recognition of unconstrained handwritten numerals. IEEE Transactions on Pattern Analysis and Machine Intelligence 17(1), 90–94 (1995)
  14. Kittler, J., Hatef, M., Duin, R.P.W., Matas, J.: On combining classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(3), 226–239 (1998)
  15. Kleinberg, E.M.: Stochastic discrimination. Annals of Mathematics and Artificial Intelligence 1, 207–239 (1990)
  16. Kleinberg, E.M.: An overtraining-resistant stochastic modeling method for pattern recognition. Annals of Statistics 24(6), 2319–2349 (1996)
  17. Kleinberg, E.M.: On the algorithmic implementation of stochastic discrimination. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(5), 473–490 (2000)
  18. Kleinberg, E.M.: A mathematically rigorous foundation for supervised learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 67–76. Springer, Heidelberg (2000)
  19. Lam, L., Suen, C.Y.: Application of majority voting to pattern recognition. IEEE Transactions on Systems, Man, and Cybernetics 27(5), 553–568 (1997)
  20. Vapnik, V.: Estimation of Dependences Based on Empirical Data. Springer, Heidelberg (1982)
  21. Vapnik, V.: Statistical Learning Theory. John Wiley & Sons, Chichester (1998)
  22. Wolpert, D.H.: Stacked generalization. Neural Networks 5, 241–259 (1992)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

Tin Kam Ho, Bell Labs, Lucent Technologies
