Searching sharing relationship for instance segmentation decoder

Xi, Yuling; Wang, Ning; Wan, Shaohua; Wang, Xiaoming; Wang, Peng; Zhang, Yanning

doi:10.1007/s10489-022-04434-y

Published: 02 May 2023

Volume 53, pages 20938–20949, (2023)

Applied Intelligence

Yuling Xi¹, Ning Wang¹, Shaohua Wan² (ORCID: orcid.org/0000-0001-7013-9081), Xiaoming Wang¹, Peng Wang¹ & Yanning Zhang¹

Abstract

Instance segmentation is a typical visual task that requires per-pixel mask prediction with a category label for each instance. In the decoder of an instance segmentation network, parallel branches or towers are commonly adopted to handle instance-level and dense-level predictions. However, this parallelism ignores inter-branch and intra-branch relationships. Moreover, how the different branches should be connected is unclear and difficult to explore manually in practice. To address these issues, we introduce Neural Architecture Search (NAS) to automatically search for hardware- and memory-friendly feature-sharing branches. Concretely, for instance segmentation, we design a search space that considers both the operations and the sharing connections of parallel branches. Through a tailored reinforcement learning (RL) paradigm, we can efficiently search multiple architectures with different sharing patterns and explore more feature selection possibilities. Our method is generic and can be transferred to analogous multi-task networks. The searched architecture shares features in the middle of the head branches and uses instance-level head features to generate pixel-level predictions. Extensive experiments demonstrate the effectiveness of our method, which surpasses classical parallel decoder networks, exceeding BlendMask by 1.2% on bounding box mAP and 0.9% on segmentation mAP.
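To make the idea of searching over both operations and sharing connections concrete, the following is a minimal sketch, not the authors' implementation. It assumes two parallel head branches of four layers each, a hypothetical list of candidate operations (`OPS`) and feature sources (`SOURCES`), and a plain REINFORCE controller with a moving-average baseline in place of the tailored RL paradigm the abstract mentions. The `proxy_reward` function is a placeholder: in a real search the reward would come from training and validating each sampled decoder.

```python
import math
import random

# Hypothetical candidate sets; the abstract does not list the actual ones.
OPS = ["conv3x3", "sep_conv3x3", "dilated_conv3x3", "skip"]
SOURCES = ["own_branch", "other_branch", "sum_of_both"]
NUM_LAYERS = 4  # assumed depth of each head branch


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Controller:
    """One independent categorical policy per decision, trained with REINFORCE."""

    def __init__(self, choice_sizes, lr=0.1):
        self.logits = [[0.0] * n for n in choice_sizes]
        self.lr = lr
        self.baseline = 0.0  # moving average of past rewards

    def sample(self, rng):
        return [rng.choices(range(len(l)), weights=softmax(l))[0]
                for l in self.logits]

    def update(self, actions, reward):
        advantage = reward - self.baseline
        self.baseline = 0.9 * self.baseline + 0.1 * reward
        for logits, a in zip(self.logits, actions):
            probs = softmax(logits)
            for i in range(len(logits)):
                # Gradient of log-probability of the chosen action w.r.t. logit i.
                grad = (1.0 if i == a else 0.0) - probs[i]
                logits[i] += self.lr * advantage * grad


def decode(actions):
    """Map a flat action list to per-layer (operation, source) pairs."""
    arch = {"instance": [], "dense": []}
    it = iter(actions)
    for branch in arch:
        for _ in range(NUM_LAYERS):
            arch[branch].append((OPS[next(it)], SOURCES[next(it)]))
    return arch


def proxy_reward(arch):
    """Placeholder reward (an assumption): fraction of layers that read
    features from the other branch. A real search would use validation mAP."""
    layers = [pair for branch in arch.values() for pair in branch]
    shared = sum(1 for _, src in layers if src != "own_branch")
    return shared / len(layers)


def search(steps=300, seed=0):
    rng = random.Random(seed)
    sizes = [len(OPS), len(SOURCES)] * (2 * NUM_LAYERS)
    controller = Controller(sizes)
    best = (-1.0, None)
    for _ in range(steps):
        actions = controller.sample(rng)
        arch = decode(actions)
        reward = proxy_reward(arch)
        controller.update(actions, reward)
        if reward > best[0]:
            best = (reward, arch)
    return best


if __name__ == "__main__":
    reward, arch = search()
    print(reward)
    for branch, layers in arch.items():
        print(branch, layers)
```

Each sampled architecture is a list of (operation, source) pairs per branch, so a sharing pattern such as "share features in the middle of the head branches" corresponds to middle layers whose source is not `own_branch`.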

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. U19B2037), National Key R&D Program of China (No. 2020AAA0106900), Shaanxi Provincial Key R&D Program (No. 2021KWZ-03), and Natural Science Basic Research Program of Shaanxi (No. 2021JCW-03).

Author information

Authors and Affiliations

School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Yuling Xi, Ning Wang, Xiaoming Wang, Peng Wang & Yanning Zhang
Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, China
Shaohua Wan

Corresponding author

Correspondence to Yanning Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xi, Y., Wang, N., Wan, S. et al. Searching sharing relationship for instance segmentation decoder. Appl Intell 53, 20938–20949 (2023). https://doi.org/10.1007/s10489-022-04434-y

Accepted: 26 December 2022
Published: 02 May 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s10489-022-04434-y
