Multi-class Open Set Recognition Using Probability of Inclusion

  • Lalit P. Jain
  • Walter J. Scheirer
  • Terrance E. Boult
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8691)


The perceived success of recent visual recognition approaches has largely been derived from their performance on classification tasks, where all possible classes are known at training time. But what about open set problems, where unknown classes appear at test time? Intuitively, if we could accurately model just the positive data for any known class without overfitting, we could reject the large set of unknown classes even under an assumption of incomplete class knowledge. In this paper, we formulate the problem as one of modeling positive training data at the decision boundary, where we can invoke statistical extreme value theory. A new algorithm called the P_I-SVM is introduced for estimating the unnormalized posterior probability of class inclusion.
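The idea in the abstract can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the per-class decision scores are made-up numbers standing in for the outputs of a one-vs-rest SVM, the names `fit_weibull`, `fit_class_model`, `inclusion_probability`, and `predict_open_set`, the tail size, the location offset, and the rejection threshold `delta` are all hypothetical, and the Weibull fit uses a plain maximum-likelihood bisection rather than any particular EVT library.

```python
import math

def fit_weibull(xs, lo=1e-3, hi=50.0, iters=100):
    """Maximum-likelihood fit of a 2-parameter Weibull (shape k, scale lam).

    xs must be strictly positive. Samples are rescaled by their maximum so
    that powers stay bounded; the shape estimate is scale-invariant.
    """
    top = max(xs)
    ys = [x / top for x in xs]
    logs = [math.log(y) for y in ys]
    mean_log = sum(logs) / len(ys)

    def g(k):
        # Likelihood equation for the shape; it is increasing in k.
        w = [y ** k for y in ys]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    for _ in range(iters):  # bisection on a log scale
        mid = math.sqrt(lo * hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = math.sqrt(lo * hi)
    lam = top * (sum(y ** k for y in ys) / len(ys)) ** (1.0 / k)
    return k, lam

def fit_class_model(pos_scores, tail_size=5, offset=0.05):
    """Model the positive scores closest to the decision boundary.

    Assumption: the location tau sits slightly below the smallest tail score.
    """
    tail = sorted(pos_scores)[:tail_size]
    tau = tail[0] - offset
    k, lam = fit_weibull([s - tau for s in tail])
    return tau, k, lam

def inclusion_probability(score, tau, k, lam):
    """Weibull CDF of the shifted score; 0 at or below the location tau."""
    z = score - tau
    if z <= 0:
        return 0.0
    return 1.0 - math.exp(-((z / lam) ** k))

def predict_open_set(scores_by_class, models, delta=0.5):
    """Return the most probable known class, or None to reject as unknown."""
    probs = {c: inclusion_probability(s, *models[c])
             for c, s in scores_by_class.items()}
    best = max(probs, key=probs.get)
    return best if probs[best] >= delta else None

# Hypothetical positive-class decision scores for two known classes.
models = {
    "cat": fit_class_model([0.35, 0.52, 0.61, 0.78, 0.94, 1.10, 1.32, 1.51]),
    "dog": fit_class_model([0.40, 0.47, 0.66, 0.81, 0.90, 1.25, 1.40, 1.95]),
}
print(predict_open_set({"cat": 2.0, "dog": -0.5}, models))   # confident known class
print(predict_open_set({"cat": -1.0, "dog": -0.8}, models))  # rejected as unknown
```

Because only positive data near the boundary is modeled, a test sample that scores poorly against every known class receives a low inclusion probability for all of them and is rejected, which is the open set behavior the abstract describes. The threshold `delta` trades false rejections of known classes against false acceptances of unknown ones.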


Support Vector Machine, Positive Class, Support Vector Data Description, Extreme Value Theory, Class Inclusion
These keywords were added by machine and not by the authors.



Supplementary material

Electronic Supplementary Material: 978-3-319-10578-9_26_MOESM1_ESM.pdf (PDF, 1,255 KB)



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Lalit P. Jain (1)
  • Walter J. Scheirer (1, 2)
  • Terrance E. Boult (1, 3)
  1. University of Colorado, Colorado Springs, USA
  2. Harvard University, USA
  3. Securics, Inc., USA
