TPRNet: camouflaged object detection via transformer-induced progressive refinement network

  • Original article
  • Published in: The Visual Computer

Abstract

Camouflaged object detection (COD) is a challenging task that aims to detect objects visually similar to their surrounding environment. In this paper, we propose a transformer-induced progressive refinement network (TPRNet) to address this task. Specifically, our network comprises a Transformer-induced Progressive Refinement Module (TPRM) and a Semantic-Spatial Interaction Enhancement Module (SIEM). In TPRM, high-level features rich in semantic information are integrated through transformers as prior guidance and then sent to the refinement concurrency unit (RCU), which obtains accurately localized feature regions via a progressive refinement strategy. In SIEM, the accurately localized semantic features interact with low-level features to obtain rich fine-grained cues and strengthen the representation of boundary features. Extensive experiments on four widely used benchmark datasets (i.e., CAMO, CHAMELEON, COD10K, and NC4K) demonstrate that TPRNet is an effective COD model that outperforms state-of-the-art models. The code is available at https://github.com/zhangqiao970914/TPRNet.
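The abstract describes a general pattern used by many COD networks: coarse but semantically accurate high-level features guide fine-grained low-level features. As a hedged illustration only — the function names, the nearest-neighbour upsampling, and the sigmoid gating below are assumptions for the sketch, not the authors' SIEM implementation — the idea can be expressed in a few lines of NumPy:

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def semantic_spatial_interaction(low, high):
    """Toy semantic-spatial interaction (hypothetical, not the authors' SIEM):
    upsample the coarse semantic feature to the low-level resolution,
    gate the low-level feature with it, and keep a residual connection so
    fine-grained detail is enhanced rather than replaced."""
    factor = low.shape[1] // high.shape[1]
    high_up = upsample_nearest(high, factor)
    gate = 1.0 / (1.0 + np.exp(-high_up))  # sigmoid attention from semantics
    return low * gate + low                # gated low-level + residual

low = np.random.rand(8, 16, 16)   # fine, low-level feature (C, H, W)
high = np.random.rand(8, 4, 4)    # coarse, semantic feature
fused = semantic_spatial_interaction(low, high)
print(fused.shape)  # (8, 16, 16)
```

The residual term means regions the semantic prior marks confidently are amplified while low-level detail elsewhere is still retained, which is the role the abstract assigns to the interaction between localized semantics and boundary-rich low-level features.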



Acknowledgements

This work was supported by the Anhui Province Key Laboratory of Infrared and Low-Temperature Plasma under Grant No. IRKL2022KF07.

Author information


Correspondence to Cong Zhang or Hongbo Bi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Q., Ge, Y., Zhang, C. et al. TPRNet: camouflaged object detection via transformer-induced progressive refinement network. Vis Comput 39, 4593–4607 (2023). https://doi.org/10.1007/s00371-022-02611-1

