
Classification With Support Vector Machines and Kolmogorov-Smirnov Bounds

Journal of Statistical Theory and Practice

Abstract

This article presents a new statistical inference method for classification. Instead of minimizing a loss function that accounts only for the residuals, it uses the Kolmogorov-Smirnov bounds for the cumulative distribution function of the residuals, thereby incorporating conservative bounds on the underlying probability distribution of the residual population. The loss functions considered are based on the theory of support vector machines. Parameters of the discriminant functions are computed using a minimax criterion, and for a wide range of popular loss functions these computations are shown to be feasible, based on new optimization results presented in this article. The method is illustrated with examples, using both small simulated data sets and real-world data.
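
To make the abstract's idea concrete, the following is a minimal illustrative sketch, not the authors' algorithm: it minimizes an upper expected loss taken over the Kolmogorov-Smirnov band around the empirical distribution of the losses. Every specific choice here is an assumption for illustration only: the hinge loss stands in for the SVM loss family, the band width comes from the Dvoretzky-Kiefer-Wolfowitz inequality, the leftover probability mass is truncated to the largest observed loss, and a crude grid search replaces the optimization results the article actually develops. The function names (ks_epsilon, upper_expected_loss, minimax_fit) are hypothetical.

```python
# Illustrative sketch (NOT the paper's algorithm): minimax classification
# where the empirical loss distribution is replaced by its
# Kolmogorov-Smirnov (KS) band, and parameters minimize the upper
# expected loss over that band.
import numpy as np

def ks_epsilon(n, alpha=0.05):
    """KS band half-width via the Dvoretzky-Kiefer-Wolfowitz bound:
    eps = sqrt(ln(2/alpha) / (2n))."""
    return np.sqrt(np.log(2.0 / alpha) / (2.0 * n))

def upper_expected_loss(losses, eps):
    """Upper expectation of the loss over the KS p-box around the
    empirical CDF, truncated to the observed support: the lower CDF
    max(F_n - eps, 0) shifts eps of the mass onto the largest loss."""
    l = np.sort(losses)
    n = len(l)
    f_lower = np.maximum(np.arange(1, n + 1) / n - eps, 0.0)
    f_lower[-1] = 1.0  # truncation: leftover mass sits at the max loss
    w = np.diff(np.concatenate(([0.0], f_lower)))
    return float(w @ l)

def hinge(y, score):
    """SVM hinge loss for labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * score)

def minimax_fit(x, y, alpha=0.05, grid=np.linspace(-5, 5, 501)):
    """Toy 1-D discriminant sign(w*x + b): grid search for the (w, b)
    minimizing the upper expected hinge loss over the KS band."""
    eps = ks_epsilon(len(y), alpha)
    best = None
    for w in grid:
        for b in grid[::25]:  # coarse grid on the intercept
            u = upper_expected_loss(hinge(y, w * x + b), eps)
            if best is None or u < best[0]:
                best = (u, w, b)
    return best

# Usage on simulated two-class data with labels in {-1, +1}
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1, 1, 50), rng.normal(1, 1, 50)])
y = np.concatenate([-np.ones(50), np.ones(50)])
u, w, b = minimax_fit(x, y)
print(f"upper expected hinge loss {u:.3f} at w={w:.2f}, b={b:.2f}")
```

The key step is the weighting inside upper_expected_loss: within the KS band, the most pessimistic distribution pushes eps of the probability mass toward the largest losses, so the minimax fit is systematically more conservative than the ordinary empirical-risk fit.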

Author information

Correspondence to Frank P. A. Coolen.

About this article

Cite this article

Utkin, L.V., Coolen, F.P.A. Classification With Support Vector Machines and Kolmogorov-Smirnov Bounds. J Stat Theory Pract 8, 297–318 (2014). https://doi.org/10.1080/15598608.2013.788985
