Pattern Recognition and Image Analysis, Volume 28, Issue 4, pp 658–663

In Defense of Active Part Selection for Fine-Grained Classification

  • D. Korsch
  • J. Denzler
Proceedings of the 6th International Workshop

Abstract

Fine-grained classification is a recognition task in which subtle differences distinguish between classes. Part-based methods are most commonly used for this problem: they learn to detect parts of the observed object and extract local features for the detected part regions. In this paper we show that not all extracted part features are useful for classification. Furthermore, assuming a part selection algorithm that actively selects parts for classification, we estimate an upper bound for fine-grained recognition performance. This upper bound lies far above current state-of-the-art recognition performance, which demonstrates the need for such an active part selection method. Although we do not present an active part selection algorithm in this work, we propose a novel method that active part selection requires and that enables sequential part-based classification. This method uses a support vector machine (SVM) ensemble and allows an image to be classified from an arbitrary number of part features. Additionally, the training time of our method does not grow with the number of possible part features. This allows the SVM ensemble to be extended with an active part selection component that operates on a large number of part feature proposals without suffering from increasing training time.
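The core idea above — an SVM ensemble whose training cost is independent of the number of candidate parts, and which can classify from however many part features happen to be available — can be illustrated with a minimal sketch. The paper does not specify implementation details; the data, ensemble size, and score-averaging rule below are assumptions for illustration, with part features standing in for CNN activations:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Sketch of an SVM ensemble for sequential part-based classification.
# Assumption: each part feature is a fixed-length vector, and every SVM in
# the ensemble is trained on individual part features (bagging over parts),
# so a test image can be scored with an arbitrary subset of its parts.

rng = np.random.default_rng(0)

# Toy data: 200 part feature vectors (e.g. local CNN activations), 2 classes.
X = rng.normal(size=(200, 16))
y = (X[:, 0] > 0).astype(int)

# Bagging: each ensemble member sees a bootstrap sample of part features.
# Training cost depends on the sample size, not on how many candidate
# parts an active selection component might later propose.
ensemble = []
for _ in range(5):
    idx = rng.integers(0, len(X), size=len(X))
    ensemble.append(LinearSVC(C=1.0).fit(X[idx], y[idx]))

def classify(parts):
    """Classify from an arbitrary number of part feature vectors by
    averaging decision scores over all parts and all ensemble members."""
    score = np.mean([svm.decision_function(parts).mean() for svm in ensemble])
    return int(score > 0)

# Works with one part or many -- no retraining needed.
print(classify(X[:1]), classify(X[:10]))
```

Averaging decision scores (rather than majority voting) keeps the combination rule well defined for a single part as well as for many, which is what a sequential, actively selecting classifier needs.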

Keywords

fine-grained recognition, SVM ensemble, bagging

Copyright information

© Pleiades Publishing, Ltd. 2018

Authors and Affiliations

  1. Computer Vision Group, Friedrich Schiller University Jena, Jena, Germany
