Exploring a Distillation with Embedded Prompts for Object Detection in Adverse Environments

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14434)

Abstract

Efficient and robust object detection in adverse environments is both crucial and challenging for autonomous agents. The mainstream approach applies image enhancement or restoration as a preprocessing step to reduce the domain shift between adverse and regular scenes. However, these image-level methods cannot guide the model to capture the spatial and semantic information of object instances, so they yield only marginal performance improvements. To overcome this limitation, we explore a Prompts Embedded Distillation framework, called PED. Specifically, a spatial location prompt module guides the model to learn target position information that is easily missed. Considering the correlation between object instances in a scene, a semantic mask prompt module constrains the global attention between instances, making each aggregated instance feature more discriminative. We embed these prompts in a teacher model and then transfer its knowledge to the original student model through focal distillation. Extensive experimental results demonstrate the effectiveness and flexibility of our approach.
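
The abstract does not specify the form of the distillation loss, so the following is only a minimal illustrative sketch of one common way to weight teacher-to-student feature imitation by object location. It is not the PED implementation. The function names, the `(C, H, W)` feature layout, the integer box format `(x1, y1, x2, y2)`, and the weights `fg_weight` and `bg_weight` are all assumptions introduced for illustration.

```python
import numpy as np


def box_mask(height, width, boxes):
    """Binary (H, W) mask that is 1 inside any box (x1, y1, x2, y2), else 0."""
    mask = np.zeros((height, width), dtype=np.float32)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2] = 1.0
    return mask


def focal_feature_distill_loss(student, teacher, boxes,
                               fg_weight=1.0, bg_weight=0.5):
    """Foreground/background-weighted squared error between feature maps.

    student, teacher: arrays of shape (C, H, W) (hypothetical layout).
    boxes: integer boxes in feature-map coordinates (an assumption).
    """
    _, height, width = student.shape
    fg = box_mask(height, width, boxes)
    bg = 1.0 - fg
    # Squared error per spatial location, summed over channels: shape (H, W).
    sq_err = ((student - teacher) ** 2).sum(axis=0)
    # Normalise each region by its area so a large background does not dominate.
    fg_term = (sq_err * fg).sum() / max(fg.sum(), 1.0)
    bg_term = (sq_err * bg).sum() / max(bg.sum(), 1.0)
    return float(fg_weight * fg_term + bg_weight * bg_term)
```

In this sketch, locations inside object boxes and locations outside them contribute separate, area-normalised terms, so the student is pushed hardest to match the teacher where objects actually are. A real detector would apply such a loss per feature-pyramid level and add it to the detection loss.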

This work is partially supported by the National Natural Science Foundation of China (No. U22B2052).

Author information

Corresponding author

Correspondence to Risheng Liu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Fu, H., Ma, L., Liu, J., Fan, X., Liu, R. (2024). Exploring a Distillation with Embedded Prompts for Object Detection in Adverse Environments. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14434. Springer, Singapore. https://doi.org/10.1007/978-981-99-8549-4_35

  • DOI: https://doi.org/10.1007/978-981-99-8549-4_35

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8548-7

  • Online ISBN: 978-981-99-8549-4

  • eBook Packages: Computer Science, Computer Science (R0)
