Ship detection with deep learning: a survey

Artificial Intelligence Review

Abstract

Ship detection plays a pivotal role in efficient marine monitoring, port management, and safe navigation. However, the development of ship detection techniques lags far behind that of other detection tasks, such as face detection, pedestrian detection, traffic sign/light detection, and text detection. In this paper, we examine the status quo and identify the following reasons for the slow development: (1) existing methodologies are only weakly systematic; (2) there are no unified evaluation criteria; and (3) there are no widely accepted datasets, which greatly hinders development in the deep learning era. In this context, we conduct a critical review of state-of-the-art ship detection techniques based on deep learning. The main contributions of this work are: (1) existing works on object detection are comprehensively reviewed; (2) popular/benchmark datasets are extensively collected and analysed; (3) evaluation criteria for ship detection are unified; and (4) challenges and optimization methods are discussed and future directions projected.

Notes

  1. https://cocodataset.org/#detection-eval.

  2. https://scihub.copernicus.eu/

  3. https://github.com/ultralytics/yolov5.

  4. https://www.kaggle.com/c/airbus-ship-detection.

Acknowledgements

The authors would like to acknowledge the support of the Fundamental Research Funds for the Central Universities (Grant No. 3132019344) and the Leading Scholar Grant, Dalian Maritime University (Grant No. 00253007).

Author information

Corresponding author

Correspondence to Meng Joo Er.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Er, M.J., Zhang, Y., Chen, J. et al. Ship detection with deep learning: a survey. Artif Intell Rev 56, 11825–11865 (2023). https://doi.org/10.1007/s10462-023-10455-x

  • DOI: https://doi.org/10.1007/s10462-023-10455-x

Keywords

Navigation