
Journal of Statistical Theory and Practice, Volume 8, Issue 2, pp. 297–318

Classification With Support Vector Machines and Kolmogorov-Smirnov Bounds

  • Lev V. Utkin
  • Frank P. A. Coolen

Abstract

This article presents a new statistical inference method for classification. Instead of minimizing a loss function that takes only the residuals into account, it uses the Kolmogorov-Smirnov bounds for the cumulative distribution function of the residuals, thereby incorporating conservative bounds on the underlying probability distribution of the population of residuals. The loss functions considered are based on the theory of support vector machines. Parameters for the discriminant functions are computed using a minimax criterion, and for a wide range of popular loss functions the computations are shown to be feasible, based on new optimization results presented in this article. The method is illustrated with examples, using both small simulated data sets and real-world data.
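As a rough illustration of the kind of bounds the abstract refers to (not the authors' exact formulation), the Kolmogorov-Smirnov confidence band around an empirical CDF can be sketched via the Dvoretzky-Kiefer-Wolfowitz inequality; the function name and sample values below are purely illustrative:

```python
import math

def ks_bounds(sample, alpha=0.05):
    """Kolmogorov-Smirnov (DKW) confidence band for the CDF of a sample.

    Returns the sorted sample together with lower and upper bounds on the
    empirical CDF at each sample point, clipped to [0, 1].
    """
    n = len(sample)
    # DKW inequality: band half-width eps = sqrt(ln(2/alpha) / (2n))
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))
    xs = sorted(sample)
    ecdf = [(i + 1) / n for i in range(n)]   # empirical CDF at each sorted point
    lower = [max(f - eps, 0.0) for f in ecdf]
    upper = [min(f + eps, 1.0) for f in ecdf]
    return xs, lower, upper

# Example: bounds for a small sample of residuals
xs, lo, hi = ks_bounds([0.3, 1.2, 0.7, 2.5, 1.9], alpha=0.05)
```

In the article's setting, such a band plays the role of a conservative set of candidate residual distributions, over which the minimax criterion then selects the discriminant-function parameters.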

Keywords

Classification; Imprecise probability; Kolmogorov-Smirnov bounds; Minimax; Support vector machines



Copyright information

© Grace Scientific Publishing 2014

Authors and Affiliations

  1. Department of Industrial Control and Automation, St. Petersburg State Forest Technical University, St. Petersburg, Russian Federation
  2. Department of Mathematical Sciences, Durham University, Durham, UK
