A framework for imprecise robust one-class classification models

Original Article

Abstract

This paper proposes a framework for constructing robust one-class classification models. The framework is based on Walley’s imprecise extensions of contaminated models, which produce a set of probability distributions over the data points instead of a single empirical distribution. Minimax and minimin strategies are used to choose an optimal probability distribution from this set and to construct optimal separating functions. It is shown that the algorithm for computing the optimal parameters is determined by the extreme points of the probability set and reduces to a finite number of standard SVM tasks with weighted data points. Important special cases are studied, including the pari-mutuel, constant odds-ratio and contaminated models and Kolmogorov–Smirnov bounds. Experimental results with synthetic and real data illustrate the proposed models.
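The role of extreme points can be illustrated with a minimal sketch. The abstract gives no formulas, so the following assumes the simplest special case: an ε-contaminated set built around the uniform empirical distribution on n data points, whose extreme points move the mass ε onto a single point. The function names, the `eps` parameter and the loss values are hypothetical and are not taken from the paper. The sketch only evaluates the expected loss at each extreme point. It does not solve the weighted SVM tasks themselves.

```python
from typing import List, Sequence


def contaminated_extreme_points(n: int, eps: float) -> List[List[float]]:
    """Extreme points of the set {(1 - eps) * p_emp + eps * q : q any distribution},
    where p_emp is the uniform empirical distribution on n points (an assumption)."""
    if n <= 0 or not 0.0 <= eps <= 1.0:
        raise ValueError("need n > 0 and 0 <= eps <= 1")
    base = (1.0 - eps) / n
    points = []
    for k in range(n):
        p = [base] * n
        p[k] += eps  # all contamination mass placed on data point k
        points.append(p)
    return points


def expected_loss(p: Sequence[float], losses: Sequence[float]) -> float:
    """Expected loss of the data points under the weights p."""
    return sum(pi * li for pi, li in zip(p, losses))


def worst_case_expected_loss(losses: Sequence[float], eps: float) -> float:
    """Pessimistic (minimax-style) value: largest expected loss over the extreme points."""
    pts = contaminated_extreme_points(len(losses), eps)
    return max(expected_loss(p, losses) for p in pts)


def best_case_expected_loss(losses: Sequence[float], eps: float) -> float:
    """Optimistic (minimin-style) value: smallest expected loss over the extreme points."""
    pts = contaminated_extreme_points(len(losses), eps)
    return min(expected_loss(p, losses) for p in pts)


if __name__ == "__main__":
    losses = [0.0, 1.0, 3.0]  # hypothetical per-point losses
    print(worst_case_expected_loss(losses, 0.25))
    print(best_case_expected_loss(losses, 0.25))
```

Because the expected loss is linear in the weights, its maximum and minimum over the whole set are attained at extreme points, which is why a finite enumeration suffices in this sketch. Each extreme point could then serve as a vector of data-point weights in a weighted one-class SVM task.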

Keywords

Machine learning; Novelty detection; Classification; Minimax strategy; Support vector machine; Quadratic programming

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Department of Industrial Control and Automation, St. Petersburg State Forest Technical University, St. Petersburg, Russia
