
An underwater target recognition algorithm incorporating improved attention mechanism and downsampling

  • Research
  • Published in: The Visual Computer

Abstract

Underwater target detection suffers from low recognition accuracy caused by dense and blurred targets. To address this, we propose a network that jointly improves the attention mechanism and downsampling. First, to handle dense targets, we introduce an improved channel attention module that strengthens attention to spatial-dimension information, highlights the saliency of feature maps across channels, and thereby improves the detection of dense targets. Second, to handle blurred underwater targets, we introduce a downsampling module that combines same-layer connections with cross-layer skips; it reduces the information loss caused by convolutional downsampling and fuses features from different layers more fully, enhancing the feature representation of the underwater image and, in turn, the network's detection accuracy on blurred targets. Finally, we adopt the focal loss function to address the imbalance between positive and negative samples: during training it dynamically down-weights easy-to-distinguish samples and prioritizes hard ones. Experimental results show a 2.71% increase in average accuracy for the improved algorithm on the DUO dataset, while computation is reduced by 9.1 GFLOPs and the parameter count by 5.44 M. Code: https://figshare.com/articles/dataset/improved-yolov5/25375129. Dataset: https://figshare.com/articles/dataset/DUO_zip/25370527.
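The "focal loss" the abstract refers to is the standard formulation of Lin et al.: a modulating factor (1 − p_t)^γ shrinks the loss contribution of well-classified samples so training focuses on hard ones. A minimal pure-Python sketch of the binary form follows; the `alpha=0.25, gamma=2.0` defaults are the commonly used values, not parameters reported in this paper:

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class (0 < p < 1)
    y: ground-truth label, 1 (positive) or 0 (negative)
    alpha: class-balancing weight for the positive class
    gamma: focusing parameter; gamma=0 recovers weighted cross-entropy
    """
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma down-weights easy samples (p_t close to 1)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy positive (p = 0.9) contributes far less loss than a hard one (p = 0.1)
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With γ = 2, the easy sample's loss is scaled by (1 − 0.9)² = 0.01, which is how the weight of easy-to-distinguish samples is "dynamically reduced" during training.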




Data Availability

The dataset can be accessed at the following URL: https://figshare.com/articles/dataset/DUO_zip/25370527 [32]. The code can be accessed at the following URL: https://figshare.com/articles/dataset/improved-yolov5/25375129 [33].

References

  1. Hou, W., Jing, H.: Rc-yolov5s: for tile surface defect detection. Vis. Comput. 40, 459–470 (2024)


  2. Sun, X., Shi, J., Liu, L., et al.: Transferring deep knowledge for object recognition in low-quality underwater videos. Neurocomputing 275, 897–908 (2018)


  3. Li, J., Chen, J., Sheng, B., et al.: Automatic detection and classification system of domestic waste via multimodel cascaded convolutional neural network. IEEE Trans. Ind. Inform. 18(1), 163–173 (2022)


  4. Wang, N., Chen, T., Liu, S., et al.: Deep learning-based visual detection of marine organisms: a survey. Neurocomputing 532, 1–32 (2023)


  5. Qiao, X., Bao, J., Zeng, L., et al.: An automatic active contour method for sea cucumber segmentation in natural underwater environments. Comput. Electron. Agric. 135, 134–142 (2017)


  6. Liu, H., Xu, Q., Liu, S., et al.: Evaluation of body weight of sea cucumber apostichopus japonicus by computer vision. Chin. J. Oceanol. Limnol. 33(1), 114–120 (2015)


  7. Khan, A., Fouda, M.M., Do, D.-T., et al.: Underwater target detection using deep learning: methodologies, challenges, applications, and future evolution. IEEE Access 12, 12618–12635 (2024)


  8. Liu, D., Cui, Y., Tan, W., et al.: Sg-net: Spatial granularity network for one-stage video instance segmentation. In: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 9811–9820 (2021)

  9. Cui, Y., Yan, L., Cao, Z., et al.: Tf-blender: Temporal feature blender for video object detection. In: Proc. IEEE Int. Conf. Comput. Vis., pp. 8118–8127 (2021)

  10. Cheng, B., Wei, Y., Shi, H., et al.: Revisiting rcnn: On awakening the classification power of faster rcnn. In: Lect. Notes Comput. Sci., pp. 473–490 (2018)

  11. Ren, S., He, K., Girshick, R., et al.: Faster r-cnn: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017)


  12. He, K., Gkioxari, G., Dollar, P., et al.: Mask r-cnn. In: Proc IEEE Int Conf Comput Vision, pp. 2980–2988 (2017)

  13. Zeng, L., Sun, B., Zhu, D.: Underwater target detection based on faster r-cnn and adversarial occlusion network. Eng. Appl. Artif. Intell. 100, 104190 (2021)


  14. Redmon, J., Divvala, S., Girshick, R., et al.: You only look once: Unified, real-time object detection. In: Proc IEEE Comput Soc Conf Comput Vision Pattern Recognit, pp. 779–788 (2016)

  15. Redmon, J., Farhadi, A.: Yolov3: An incremental improvement. Preprint at arXiv:1804.02767 (2018)

  16. Wang, C.-Y., Bochkovskiy, A., Liao, H.-Y.M.: Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proc IEEE Comput Soc Conf Comput Vision Pattern Recognit, pp. 7464–7475 (2023)

  17. Redmon, J., Farhadi, A.: Yolo9000: Better, faster, stronger. In: Proc. IEEE Conf.Comput. Vis. Pattern Recognit., pp. 6517–6525 (2017)

  18. Lin, X., Sun, S., Huang, W., et al.: Eapt: Efficient attention pyramid transformer for image processing. IEEE Trans. Multim. 25, 50–61 (2023)


  19. Li, X., Yu, H., Chen, H.: Multi-scale aggregation feature pyramid with cornerness for underwater object detection. Vis. Comput. 40, 1299–1310 (2024)


  20. Xie, Z., Zhang, W., Sheng, B., et al.: Bagfn: Broad attentive graph fusion network for high-order feature interactions. IEEE Trans. Neural Netw. Learn. Syst. 34(8), 4499–4513 (2023)


  21. Yang, Y., Chen, L., Zhang, J., et al.: Ugc-yolo: underwater environment object detection based on yolo with a global context block. J. Ocean Univ. China 22(3), 665–674 (2023)


  22. Liu, D., Cui, Y., Yan, L., et al.: Densernet: Weakly supervised visual localization using multi-scale feature aggregation. In: Proc. AAAI Conf. Artif. Intell., vol. 35, pp. 6101–6109 (2021)

  23. Wang, W., Han, C., Zhou, T., et al.: Visual recognition with deep nearest centroids. In: Proc. Int. Conf. Learn. Represent. (2023)

  24. Sun, Y., Zheng, W., Du, X., et al.: Underwater small target detection based on yolox combined with mobilevit and double coordinate attention. J. Mar. Sci. Eng. 11(6), 1178 (2023)


  25. Chen, Z., Qiu, G., Li, P., et al.: Mngnas: distilling adaptive combination of multiple searched networks for one-shot neural architecture search. IEEE Trans. Pattern Anal. Mach. Intell. 45(11), 13489–13508 (2023)


  26. Hu, J., Shen, L., Albanie, S., et al.: Squeeze-and-excitation networks. In: Proc IEEE Comput Soc Conf Comput Vision Pattern Recognit, pp. 7132–7141 (2018)

  27. Liu, C., Li, H., Wang, S., et al.: A dataset and benchmark of underwater object detection for robot picking. In: IEEE Int. Conf. Multimed. Expo Workshops, ICMEW, pp. 1–6 (2021)

  28. Everingham, M., Eslami, S.M.A., Van Gool, L., et al.: The pascal visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111(1), 98–136 (2015)


  29. Fu, C., Liu, R., Fan, X., et al.: Rethinking general underwater object detection: datasets, challenges, and solutions. Neurocomputing 517, 243–256 (2023)


  30. Jungseok, H., Michael, F., Junaed, S.: TrashCan: A Semantically-Segmented Dataset towards Visual Detection of Marine Debris. Preprint at arXiv:2007.08097 (2020)

  31. Liu, C., Wang, Z., Wang, S., et al.: A new dataset, poisson gan and aquanet for underwater object grabbing. IEEE Trans. Circuits Syst. Video Technol. 32(5), 2831–2844 (2022)


  32. Zhu, Q., Cen, Q., Wang, Y., et al.: DUO.zip. figshare. Dataset (2024). https://doi.org/10.6084/m9.figshare.25370527.v1

  33. Zhu, Q., Cen, Q., Wang, Y., et al.: improved-yolov5. figshare. Dataset (2024). https://doi.org/10.6084/m9.figshare.25375129.v1


Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 61773333 and 62273296.

Author information


Corresponding author

Correspondence to Qiang Cen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest with this work, and no commercial or associative interests that represent a conflict of interest in connection with the work submitted.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhu, Q., Cen, Q., Wang, Y. et al. An underwater target recognition algorithm incorporating improved attention mechanism and downsampling. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03437-9

