What Leads to Generalization of Object Proposals?

Wang, Rui; Mahajan, Dhruv; Ramanathan, Vignesh

doi:10.1007/978-3-030-66096-3_32

Rui Wang¹⁰,
Dhruv Mahajan¹⁰ &
Vignesh Ramanathan¹⁰

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12536))

Included in the following conference series:

European Conference on Computer Vision

Abstract

Object proposal generation is often the first step in many detection models. It is lucrative to train a good proposal model, that generalizes to unseen classes. Motivated by this, we study how a detection model trained on a small set of source classes can provide proposals that generalize to unseen classes. We systematically study the properties of the dataset – visual diversity and label space granularity – required for good generalization. We show the trade-off between using fine-grained labels and coarse labels. We introduce the idea of prototypical classes: a set of sufficient and necessary classes required to train a detection model to obtain generalized proposals in a more data-efficient way. On the Open Images V4 dataset, we show that only \(25\%\) of the classes can be selected to form such a prototypical set. The resulting proposals from a model trained with these classes is only \(4.3\%\) worse than using all the classes, in terms of average recall (AR). We also demonstrate that Faster R-CNN model leads to better generalization of proposals compared to a single-stage network like RetinaNet.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
We chose [25] due to its simplicity. In practice, we can use other weakly supervised approaches too.

References

Arun, A., Jawahar, C., Kumar, M.P.: Dissimilarity coefficient based weakly supervised object detection. In: CVPR (2019)
Google Scholar
Chavali, N., Agrawal, H., Mahendru, A., Batra, D.: Object-proposal evaluation protocol is ‘gameable’. In: CVPR (2016)
Google Scholar
Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: NeurIPS (2016)
Google Scholar
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
Google Scholar
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1627–1645 (2009)
Article Google Scholar
Gao, J., Wang, J., Dai, S., Li, L.J., Nevatia, R.: NOTE-RCNN: noise tolerant ensemble RCNN for semi-supervised object detection. In: ICCV (2019)
Google Scholar
Gao, Y., et al.: C-MIDN: coupled multiple instance detection network with segmentation guidance for weakly supervised object detection. In: ICCV (2019)
Google Scholar
Girshick, R., Radosavovic, I., Gkioxari, G., Dollár, P., He, K.: Detectron (2018)
Google Scholar
Guillaumin, M., Ferrari, V.: Large-scale knowledge transfer for object localization in ImageNet. In: CVPR (2012)
Google Scholar
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: ICCV (2017)
Google Scholar
Hoffman, J., et al.: LSDA: large scale detection through adaptation. In: NeurIPS (2014)
Google Scholar
Hoffman, J., et al.: Large scale visual recognition through adaptation using joint representation and multiple instance learning. J. Mach. Learn. Res. 17, 4954–4984 (2016)
MathSciNet Google Scholar
Hosang, J., Benenson, R., Dollár, P., Schiele, B.: What makes for effective detection proposals? IEEE Trans. Pattern Anal. Mach. Intell. 38, 814–830 (2015)
Article Google Scholar
Krähenbühl, P., Koltun, V.: Geodesic object proposals. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 725–739. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_47
Chapter Google Scholar
Kuo, W., Hariharan, B., Malik, J.: DeepBox: learning objectness with convolutional networks. In: ICCV (2015)
Google Scholar
Kuznetsova, A., et al.: The open images dataset v4: unified image classification, object detection, and visual relationship detection at scale. arXiv preprint arXiv:1811.00982 (2018)
Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: CVPR (2017)
Google Scholar
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: ICCV (2017)
Google Scholar
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Chapter Google Scholar
Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Chapter Google Scholar
Novotny, D., Larlus, D., Vedaldi, A.: I have seen enough: transferring parts across categories. In: BMVC (2016)
Google Scholar
Ott, P., Everingham, M.: Shared parts for deformable part-based models. In: CVPR (2011)
Google Scholar
Pinheiro, P.O., Collobert, R., Dollár, P.: Learning to segment object candidates. In: NeurIPS (2015)
Google Scholar
Pont-Tuset, J., Arbelaez, P., Barron, J.T., Marques, F., Malik, J.: Multiscale combinatorial grouping for image segmentation and object proposal generation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 128–140 (2016)
Article Google Scholar
Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: CVPR (2017)
Google Scholar
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NeurIPS (2015)
Google Scholar
Rochan, M., Wang, Y.: Weakly supervised localization of novel objects using appearance transfer. In: CVPR (2015)
Google Scholar
Salakhutdinov, R., Torralba, A., Tenenbaum, J.: Learning to share visual appearance for multiclass object detection. In: CVPR (2011)
Google Scholar
Singh, B., Li, H., Sharma, A., Davis, L.S.: R-FCN-3000 at 30fps: decoupling detection and classification. In: CVPR (2018)
Google Scholar
Szegedy, C., Reed, S., Erhan, D., Anguelov, D., Ioffe, S.: Scalable, high-quality object detection. arXiv preprint arXiv:1412.1441 (2014)
Tang, P., et al.: PCL: proposal cluster learning for weakly supervised object detection. IEEE Trans. Pattern Anal. Mach. Intell. 42, 176–191 (2018)
Article Google Scholar
Tang, Y., Wang, J., Gao, B., Dellandréa, E., Gaizauskas, R., Chen, L.: Large scale semi-supervised object detection using visual and semantic knowledge transfer. In: CVPR (2016)
Google Scholar
Torralba, A., Murphy, K.P., Freeman, W.T., et al.: Sharing features: efficient boosting procedures for multiclass object detection. In: CVPR (2004)
Google Scholar
Uijlings, J., Popov, S., Ferrari, V.: Revisiting knowledge transfer for training object class detectors. In: CVPR (2018)
Google Scholar
Uijlings, J.R., Van De Sande, K.E., Gevers, T., Smeulders, A.W.: Selective search for object recognition. Int. J. Comput. Vis. 104, 154–171 (2013)
Article Google Scholar
Yang, H., Wu, H., Chen, H.: Detecting 11K classes: large scale object detection without fine-grained bounding boxes. arXiv preprint arXiv:1908.05217 (2019)
Zitnick, C.L., Dollár, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 391–405. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_26
Chapter Google Scholar

Download references

Author information

Authors and Affiliations

Facebook AI, Menlo Park, USA
Rui Wang, Dhruv Mahajan & Vignesh Ramanathan

Authors

Rui Wang
View author publications
You can also search for this author in PubMed Google Scholar
Dhruv Mahajan
View author publications
You can also search for this author in PubMed Google Scholar
Vignesh Ramanathan
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Rui Wang .

Editor information

Editors and Affiliations

University of Clermont Auvergne, Clermont Ferrand, France
Adrien Bartoli
Università degli Studi di Udine, Udine, Italy
Andrea Fusiello

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 133 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wang, R., Mahajan, D., Ramanathan, V. (2020). What Leads to Generalization of Object Proposals?. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science(), vol 12536. Springer, Cham. https://doi.org/10.1007/978-3-030-66096-3_32

Download citation

DOI: https://doi.org/10.1007/978-3-030-66096-3_32
Published: 03 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66095-6
Online ISBN: 978-3-030-66096-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics