Comparison between various regression depth methods and the support vector machine to approximate the minimum number of misclassifications

Summary

The minimum number of misclassifications achievable with affine hyperplanes on a given set of labeled points is a key quantity in both statistics and computational learning theory. However, determining this quantity exactly is NP-hard (cf. Höffgen, Simon and van Horn, 1995), so reasonable approximation procedures are needed. This paper introduces two new approaches to approximating it. Both are modifications of the regression depth method proposed by Rousseeuw and Hubert (1999) for linear regression models. Our algorithms are compared to the existing regression depth algorithm (cf. Christmann and Rousseeuw, 1999) on various data sets. A support vector machine approach, as proposed by Vapnik (1998), serves as a reference method.
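
To make the target quantity concrete, the following is a minimal Python sketch for two-dimensional data. It is not the regression depth method, the support vector machine, or either of the new algorithms of this paper. It only counts the misclassifications of a given affine hyperplane and scans a finite set of directions to obtain an upper bound on the minimum. The function names, the labels in {-1, +1}, the convention that points on the hyperplane count as errors, and the default of 180 directions are all assumptions made for illustration.

```python
import numpy as np


def count_misclassifications(X, y, w, b):
    """Count points with y_i * (w . x_i + b) <= 0 for labels y_i in {-1, +1}.

    Points lying exactly on the hyperplane are counted as errors; this
    tie convention is an assumption of the sketch.
    """
    return int(np.sum(y * (X @ w + b) <= 0))


def best_offset_for_direction(X, y, w):
    """Smallest error count over all offsets b for a fixed direction w."""
    proj = np.unique(X @ w)  # sorted distinct projections onto w
    # Candidate thresholds: below all points, between neighbours, above all.
    cuts = np.concatenate(
        ([proj[0] - 1.0], (proj[:-1] + proj[1:]) / 2.0, [proj[-1] + 1.0])
    )
    return min(count_misclassifications(X, y, w, -t) for t in cuts)


def upper_bound_min_errors(X, y, n_directions=180):
    """Upper bound on the minimum number of misclassifications in 2D.

    Directions are equally spaced on the full circle, so both
    orientations of every scanned hyperplane are covered. Only finitely
    many directions are tried, hence the result is an upper bound and
    not necessarily the exact minimum.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_directions, endpoint=False)
    return min(
        best_offset_for_direction(X, y, np.array([np.cos(a), np.sin(a)]))
        for a in angles
    )


if __name__ == "__main__":
    # XOR-type configuration: no line separates the two classes.
    X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
    y = np.array([-1, -1, 1, 1])
    print(upper_bound_min_errors(X, y))
```

For a fixed direction the scan over thresholds is exact, so the only source of slack is the finite grid of directions. In higher dimensions such a grid becomes impractical very quickly, which is consistent with the NP-hardness result cited above.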

References

  • Albert, A. and Anderson, J.A. (1984). On the existence of maximum likelihood estimates in logistic regression models. Biometrika 71, 1–10.

  • Boser, B., Guyon, I., and Vapnik, V. (1992). A training algorithm for optimal margin classifiers. Proceedings of the 5th Annual Workshop on Computational Learning Theory, 144–152.

  • Burges, C.J.C. (1998). A tutorial on support vector machines for pattern recognition. Knowledge Discovery and Data Mining 2, 1–43.

  • Christmann, A. and Rousseeuw, P.J. (1999). Measuring overlap in logistic regression. Technical Report, University of Dortmund, SFB 475. To appear in: Computational Statistics and Data Analysis. www.statistik.uni-dortmund.de/sfb475/berichte/tr25-99-software.zip

  • Efron, B. (1986). Double exponential families and their use in generalized linear regression. J. Amer. Statist. Assoc., 81, 709–721.

  • Finney, D.J. (1947). The estimation from individual records of the relationship between dose and quantal response. Biometrika 34, 320–334.

  • Hermans, J. and Habbema, J.D.F. (1975). Comparison of five methods to estimate posterior probabilities. EDV in Medizin und Biologie 6, 14–19.

  • Höffgen, K.U., Simon, H.-U., and van Horn, K.S. (1995). Robust Trainability of Single Neurons. J. Computer and System Sciences, 50, 114–125.

  • Hosmer, D.W. and Lemeshow, S. (1989). Applied Logistic Regression. Wiley, New York.

  • Jaeger, H.J., Mair, T., Geller, M., Kinne, R.K., Christmann, A., Mathias, K.D. (1997). A physiologic in vitro model of the inferior vena cava with a computer-controlled flow system for testing of inferior vena cava filters. Investigative Radiology 32, 511–522.

  • Jaeger, H.J., Kolb, S., Mair, T., Geller, M., Christmann, A., Kinne, R.K., Mathias, K.D. (1998). In vitro model for the evaluation of inferior vena cava filters: effect of experimental parameters on thrombus-capturing efficacy of the Vena Tech-LGM Filter. Journal of Vascular and Interventional Radiology 9, 295–304.

  • Joachims, T. (1999). Making Large-Scale SVM Learning Practical. In: B. Schölkopf, C. Burges, A. Smola (eds.), Advances in Kernel Methods — Support Vector Learning, MIT-Press. www-ai.cs.uni-dortmund.de/svm_light

  • Künsch, H.R., Stefanski, L.A. and Carroll, R.J. (1989). Conditionally unbiased bounded-influence estimation in general regression models, with applications to generalized linear models. J. Amer. Statist. Assoc. 84, 460–466.

  • Lee, E.T. (1974). A computer program for linear logistic regression analysis. Computer Programs in Biomedicine, 80–92.

  • Novikoff, A. (1962). On convergence proofs on perceptrons. Proceedings of the Symposium on the Mathematical Theory of Automata, Vol XII, pp. 615–622.

  • Osuna, E., Freund, R., and Girosi, F. (1997). An improved algorithm for training support vector machines. Proceedings of the IEEE Workshop on Neural Networks for Signal Processing, 276–285.

  • Platt, J. (1999). Fast training of support vector machines using sequential minimal optimization. In: B. Schölkopf, C. Burges, A. Smola (eds.), Advances in Kernel Methods — Support Vector Learning, MIT-Press.

  • Pires, A.M. (1995). Análise Discriminante: Novos Métodos Robustos de Estimação. Ph.D. thesis, Technical University of Lisbon, Portugal.

  • Pregibon, D. (1981). Logistic regression diagnostics. Ann. Statist. 9, 705–724.

  • Riedwyl, H. (1997). Lineare Regression und Verwandtes. Birkhäuser, Basel.

  • Rosenblatt, F. (1962). Principles of Neurodynamics. Spartan, New York.

  • Rousseeuw, P.J. and Hubert, M. (1999). Regression Depth. J. Amer. Statist. Assoc., 94, 388–433.

  • Rousseeuw, P.J. and Struyf, A. (1998). Computing location depth and regression depth in higher dimensions. Statistics and Computing 8, 193–203.

  • Santner, T.J. and Duffy, D.E. (1986). A note on A. Albert and J.A. Anderson’s conditions for the existence of maximum likelihood estimates in logistic regression models. Biometrika 73, 755–758.

  • Smola, A.J. (1998). Learning with Kernels. Ph.D. thesis, TU Berlin, GMD Research Series No. 25. svm.first.gmd.de/software/logosurvey.html

  • Vapnik, V. (1998). Statistical Learning Theory. Wiley, New York.

Acknowledgements

The authors thank Prof. P.J. Rousseeuw for helpful discussions, Prof. R.J. Carroll for making available the Food Stamp data set, and Dr. H.J. Jaeger for making available the IVC data set.

Additional information

The financial support of the Deutsche Forschungsgemeinschaft (SFB 475, “Reduction of complexity in multivariate data structures”) is gratefully acknowledged.

Appendix

In the following we give pseudo-code for the heuristic method.

[Figure a: pseudo-code for the heuristic method]

About this article

Cite this article

Christmann, A., Fischer, P. & Joachims, T. Comparison between various regression depth methods and the support vector machine to approximate the minimum number of misclassifications. Computational Statistics 17, 273–287 (2002). https://doi.org/10.1007/s001800200106

  • DOI: https://doi.org/10.1007/s001800200106

Keywords

  • Linear discriminant analysis
  • Logistic regression
  • Overlap
  • Regression depth
  • Separation
  • Support vector machine