
Solving Long-Tailed Recognition with Deep Realistic Taxonomic Classifier

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12353)

Abstract

Long-tail recognition tackles the non-uniformly distributed data that arises naturally in real-world scenarios. While modern classifiers perform well on populated classes, their performance degrades significantly on tail classes. Humans, however, are less affected by this because, when confronted with uncertain examples, they simply opt to provide coarser predictions. Motivated by this, a deep realistic taxonomic classifier (Deep-RTC) is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions. The model has the option to reject classifying samples at different levels of the taxonomy whenever it cannot guarantee the desired performance. Deep-RTC is implemented with stochastic tree sampling during training, to simulate all possible classification conditions at finer or coarser levels, and a rejection mechanism at inference time. Experiments on long-tailed versions of four datasets, CIFAR100, AWA2, ImageNet, and iNaturalist, demonstrate that the proposed approach preserves more information on all classes across different popularity levels. Under the proposed correctly predicted bits (CPB) metric, Deep-RTC also outperforms state-of-the-art methods from the long-tailed recognition, hierarchical classification, and learning-with-rejection literature.
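
For concreteness, the following is a minimal sketch, not the authors' implementation, of the inference-time rejection mechanism the abstract describes: the classifier descends the class taxonomy and stops at the deepest node where its confidence still clears a threshold, returning a coarser label instead of an unreliable fine-grained one. The Node class, the per-node classifier callables, and the fixed confidence threshold are all illustrative assumptions.

    # Illustrative sketch only, not the authors' code: hierarchical inference
    # with a reject option, in the spirit of Deep-RTC. `Node`, the per-node
    # `classifier` callables, and the fixed `threshold` are assumptions.
    import numpy as np

    class Node:
        """A node of the class taxonomy; leaves are the fine-grained classes."""
        def __init__(self, name, children=None, classifier=None):
            self.name = name
            self.children = children or []   # child Node objects
            self.classifier = classifier     # callable: features -> logits over children

    def softmax(z):
        z = z - z.max()                      # stabilize before exponentiation
        e = np.exp(z)
        return e / e.sum()

    def predict_with_rejection(root, features, threshold=0.7):
        """Descend the taxonomy; stop (i.e., reject finer classification) as
        soon as confidence over a node's children falls below `threshold`."""
        node = root
        while node.children:
            probs = softmax(node.classifier(features))
            best = int(probs.argmax())
            if probs[best] < threshold:      # desired performance not guaranteed:
                break                        # return the coarser label at `node`
            node = node.children[best]
        return node.name

With a taxonomy such as animal -> bird -> sparrow, an ambiguous tail-class image would be returned as "bird" rather than forced into a likely wrong species label, which is exactly the coarser-prediction behavior the abstract attributes to human categorization.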

Keywords

Realistic predictor · Taxonomic classifier · Long-tail recognition

Acknowledgments

This work was partially funded by NSF awards IIS-1637941, IIS-1924937, and NVIDIA GPU donations.

Supplementary material

Supplementary material 1: 504445_1_En_11_MOESM1_ESM.pdf (pdf, 249 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

1. University of California, San Diego, USA