Abstract
Ships appear at widely varying scales in visible video, and this multi-scale variation is an important factor limiting the performance of visible video ship detection. Building on YOLOv5s, this paper proposes a real-time multi-scale ship detection algorithm. First, reparameterized convolution is adopted to increase the network width, enhancing the network’s ability to express multi-scale ship features. Second, the depth of the SPPF (Spatial Pyramid Pooling Fast) module is adjusted to improve the scale invariance of the ship features extracted by the network. Third, an attention module is combined with the feature pyramid network to strengthen the network’s focus on multi-scale ship features. Finally, confidence propagation cluster is used for post-processing so that the network produces bounding boxes with higher confidence that lie closer to the ground-truth boxes. Experiments show that the method achieves state-of-the-art visible video ship detection performance on multiple evaluation metrics, including mAP-IOU@0.5, mAP-IOU@[0.5:0.95], APS, APM, and APL, while meeting the requirements of real-time detection.
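The abstract does not spell out how the reparameterized convolution is built, so the following is only a minimal sketch of the general idea, assuming a RepVGG-style block with a 3x3 branch, a 1x1 branch, and an identity branch. It works on a single channel and omits batch normalization; the function names `conv2d_same` and `fuse_branches` are illustrative and not taken from the paper. The point it shows is that the wider multi-branch block used in training can be folded into one 3x3 convolution for inference, which is why the extra width need not cost inference speed.

```python
import numpy as np

def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding (output has the input's size)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))  # zero padding by default
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def fuse_branches(k3, k1):
    """Fold a 3x3 branch, a 1x1 branch, and an identity branch into one 3x3 kernel.

    Assumption: no batch normalization and one input/output channel.
    """
    fused = k3.copy()
    fused[1, 1] += k1[0, 0]  # a 1x1 kernel is a 3x3 kernel that is zero except at the centre
    fused[1, 1] += 1.0       # the identity branch is a 1x1 kernel with weight 1
    return fused

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))   # toy feature map
k3 = rng.standard_normal((3, 3))  # weights of the 3x3 branch
k1 = rng.standard_normal((1, 1))  # weights of the 1x1 branch

# Training-time form: three parallel branches summed together.
multi_branch = conv2d_same(x, k3) + conv2d_same(x, k1) + x
# Inference-time form: one 3x3 convolution with the fused kernel.
single_branch = conv2d_same(x, fuse_branches(k3, k1))

max_abs_diff = float(np.max(np.abs(multi_branch - single_branch)))
print("outputs match:", np.allclose(multi_branch, single_branch))
```

In a real network each branch would also carry its own batch normalization, whose scale and shift would have to be folded into the kernel and bias before the branches are summed; that step is left out here to keep the sketch short.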
Data availability
The data that support the findings of this study are available from the corresponding author, Yan-Tong Chen, upon reasonable request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China [Grant No.: 61901081], the China Postdoctoral Science Foundation [Grant No.: 2020M680927], and the Fundamental Research Funds for the Central Universities [Grant No.: 3132022237].
Author information
Contributions
All authors contributed to the conception and design of the study, interpretation of data, and drafting of the manuscript. YC designed and implemented the study protocol, conducted the literature review, collected and analyzed the data, and contributed to writing the manuscript. YZ designed and implemented the study protocol, collected data, and contributed to writing the manuscript. JW conducted analyses, provided expertise on the theoretical framework, interpreted the results, and contributed to writing the manuscript. YL provided expertise on the theoretical framework and contributed to writing the manuscript. The order of authors reflects their relative contributions to the project.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, YT., Zhang, YY., Wang, JL. et al. Yolov5s-MSD: a multi-scale ship detector for visible video image. Multimedia Systems 30, 3 (2024). https://doi.org/10.1007/s00530-023-01196-6
DOI: https://doi.org/10.1007/s00530-023-01196-6