Abstract
Maritime target detection is crucial for realizing ship intelligence. However, traditional detection algorithms perform poorly because marine targets are diverse and background environments are complex. We therefore take YOLOv7 as the baseline and propose an end-to-end feature fusion and feature enhancement YOLO (FE-YOLO). First, we introduce channel attention and lightweight GhostConv into the extended efficient layer aggregation network of YOLOv7, yielding the improved extended efficient layer aggregation network (IELAN) module. This improvement enables the model to capture context information better and thus enhance the target features. Second, to enhance the network's feature fusion capability, we design the light spatial pyramid pooling combined with the spatial channel pooling (LSPPCSPC) module and the coordinate attention feature pyramid network (CA-FPN). Furthermore, we develop an N-Loss based on the normalized Wasserstein distance (NWD), which effectively addresses the class imbalance issue in the ship dataset. Experimental results on the open-source Singapore maritime dataset (SMD) and the SeaShips dataset demonstrate that, compared to the baseline YOLOv7, FE-YOLO increases detection accuracy by 4.6% and 3.3%, respectively.
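The abstract names the normalized Wasserstein distance (NWD) as the basis of the N-Loss but does not give the loss itself. As a rough illustration only, the sketch below computes the NWD between two axis-aligned boxes in the usual way: each box (cx, cy, w, h) is modeled as a 2D Gaussian, and the closed-form 2-Wasserstein distance between the two Gaussians is mapped to (0, 1] with an exponential. The function names `nwd` and `nwd_loss`, the `1 - NWD` loss form, and the default constant `c` are assumptions made for this example; they are not taken from the paper.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    `c` is a dataset-dependent normalizing constant; the default used
    here is only an illustrative placeholder (assumption).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Each box is treated as a Gaussian with mean (cx, cy) and covariance
    # diag(w^2 / 4, h^2 / 4). For two such Gaussians the squared
    # 2-Wasserstein distance reduces to this sum of squared differences.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    # Map the distance to a similarity in (0, 1]; identical boxes give 1.
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(box_a, box_b, c=12.8):
    """Hypothetical regression loss: 1 - NWD (assumed form, not the paper's)."""
    return 1.0 - nwd(box_a, box_b, c)


if __name__ == "__main__":
    predicted = (13.0, 14.0, 4.0, 4.0)
    target = (10.0, 10.0, 4.0, 4.0)
    print("NWD: ", nwd(predicted, target))
    print("loss:", nwd_loss(predicted, target))
```

Identical boxes give an NWD of 1 and a loss of 0. Unlike IoU, the value stays strictly positive and varies smoothly when the boxes do not overlap at all, which is the property usually cited for using NWD with small targets.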
Data availability
All data included in this study are available upon request from the corresponding author.
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant No. 2019YFE0105400).
Funding
This study was funded by the National Key R&D Program of China (Grant No. 2019YFE0105400).
Author information
Contributions
Shouwen Cai proposed the conceptual framework. Shouwen Cai and Junbao Wu carried out the experiments and wrote the paper. Hao Meng subsequently reviewed and revised the paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies involving human participants or animals performed by any of the authors.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cai, S., Meng, H. & Wu, J. FE-YOLO: YOLO ship detection algorithm based on feature fusion and feature enhancement. J Real-Time Image Proc 21, 61 (2024). https://doi.org/10.1007/s11554-024-01445-5
DOI: https://doi.org/10.1007/s11554-024-01445-5