A novel feature-based model for zero-shot object detection with simulated attributes

Yang, Cheng; Wu, Weijia; Wang, Yuxing; Zhou, Hong

doi:10.1007/s10489-021-02746-z

Published: 18 September 2021

Volume 52, pages 6905–6914, (2022)

Applied Intelligence

Cheng Yang¹ (ORCID: orcid.org/0000-0002-6009-4227), Weijia Wu¹, Yuxing Wang¹ & Hong Zhou¹

794 Accesses
7 Citations

Abstract

Zero-shot object detection (ZSD) has recently been proposed for detecting objects whose categories were never seen during training. Existing ZSD methods have two drawbacks: (a) end-to-end methods sacrifice mean average precision (mAP) on seen classes; (b) feature-based methods avoid this problem but rely on overly simple feature construction. In this paper, we present a succinct but effective feature-based ZSD model whose feature construction leverages the deep feature embedding of the detector itself as the visual features of the detected objects. These features, named “Detection Feature” (DetFeat), contain not only visual representations but also context and position information, which makes seen and unseen objects easier to discriminate. Additionally, we simulate the way human experts construct attributes to generate a label embedding specific to the ZSD task, named “Simulated Attributes” (Simu-Attr). We find that Simu-Attr promotes better alignment between the visual and semantic spaces, alleviating the semantic gap. Extensive experiments show that our approach improves detection performance on unseen classes while maintaining high detection performance on seen classes. On the challenging COCO dataset, we surpass TL-ZSD, the best existing transductive ZSD method, by about 1% on unseen classes and about 10% on seen classes, using mAP as the metric.
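The general idea behind feature-based zero-shot classification, as described in the abstract, can be illustrated with a toy sketch: a visual feature taken from the detector is mapped into the semantic space and matched against per-class label embeddings. The abstract does not specify how DetFeat or Simu-Attr are actually built, so everything below is an assumption made for illustration: the linear projection `W`, the cosine-similarity matching, the three-dimensional attribute vectors, and the function names `project` and `classify` are all hypothetical and are not the authors' implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def project(feature: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical linear map from visual space to semantic space (one row per output dim)."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def classify(
    feature: Sequence[float],
    weights: Sequence[Sequence[float]],
    class_embeddings: Dict[str, Sequence[float]],
) -> Tuple[str, Dict[str, float]]:
    """Return the class whose label embedding is most similar to the projected feature."""
    z = project(feature, weights)
    scores = {name: cosine(z, emb) for name, emb in class_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy stand-ins: a 4-d "detection feature" and 3-d attribute-style label embeddings.
# Real detector features and label embeddings are far higher-dimensional.
W = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
class_embeddings = {
    "horse": [1.0, 0.0, 0.0],  # assumed seen class
    "bus": [0.0, 0.0, 1.0],    # assumed seen class
    "zebra": [1.0, 1.0, 0.0],  # assumed unseen class, described only by its attributes
}
det_feat = [0.9, 0.8, 0.1, 0.0]

label, scores = classify(det_feat, W, class_embeddings)
print(label, {name: round(score, 3) for name, score in scores.items()})
```

In this sketch the unseen class needs no training images, only a label embedding, which is why the quality of that embedding matters for closing the gap between visual and semantic space.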

Acknowledgments

This work is supported by the National Key Research and Development Program of China under Grant 2019YFC0118200 and the National Natural Science Foundation of China under Grant 6180332.

Author information

Authors and Affiliations

Key Laboratory for Biomedical Engineering of Ministry, Zhejiang University, Hangzhou, 310027, China
Cheng Yang, Weijia Wu, Yuxing Wang & Hong Zhou

Corresponding authors

Correspondence to Yuxing Wang or Hong Zhou.

Ethics declarations

Conflict of Interests

We have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, C., Wu, W., Wang, Y. et al. A novel feature-based model for zero-shot object detection with simulated attributes. Appl Intell 52, 6905–6914 (2022). https://doi.org/10.1007/s10489-021-02746-z

Accepted: 04 August 2021
Published: 18 September 2021
Issue Date: April 2022
DOI: https://doi.org/10.1007/s10489-021-02746-z
