Multiple attentional path aggregation network for marine object detection

Applied Intelligence

Abstract

Marine target detection is a challenging task because degraded underwater images make targets unclear. Furthermore, marine targets are small and tend to live in groups. As a result, popular object detection methods perform poorly on marine targets. This paper therefore proposes a novel multiple attentional path aggregation network, named APAN, to improve performance on marine object detection. First, we design a path aggregation network structure that brings features from the backbone network into the bottom-up path augmentation. Each feature map is enhanced by the lower layer through the bottom-up downsampling pathway and incorporates the features from the top-down upsampling layers. Specifically, the last layer fuses the feature map from the backbone network, which enhances the semantic features and improves feature extraction. Then, a multi-attention mechanism that combines coordinate competing attention and spatial supplement attention is applied to the proposed path aggregation network; it further improves the accuracy of multiple marine object detection. Finally, a double transmission underwater image enhancement algorithm is proposed to enhance the underwater image datasets. Experiments show that our method achieves 79.6% mAP on underwater image datasets and 79.03% mAP on enhanced underwater image datasets. Our method also achieves 81.5% mAP on the PASCAL VOC datasets. In addition, we apply the method to an underwater robot. The experiments show that our method performs well compared with popular object detection methods. The source code is publicly available at https://github.com/yhf2022/APAN.
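The data flow of the path aggregation structure described above can be illustrated with a minimal NumPy sketch. It is not the APAN implementation: the function names (`upsample2x`, `downsample2x`, `path_aggregation`) are hypothetical, plain element-wise addition stands in for the learned convolutions a real network would use, all levels are assumed to share one channel count, and the extra backbone fusion at the last level is only one reading of the abstract.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def downsample2x(x):
    """2x2 average pooling of a (C, H, W) feature map (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def path_aggregation(backbone_feats):
    """Fuse a feature pyramid top-down, then bottom-up.

    backbone_feats: list of (C, H, W) arrays, finest level first, each
    level having half the spatial resolution of the previous one.
    """
    n = len(backbone_feats)

    # Top-down pathway: coarse, semantically strong features are
    # upsampled and added to the finer backbone features.
    top_down = [None] * n
    top_down[-1] = backbone_feats[-1]
    for i in range(n - 2, -1, -1):
        top_down[i] = backbone_feats[i] + upsample2x(top_down[i + 1])

    # Bottom-up pathway: each level combines its top-down feature with
    # the downsampled output of the level below it.
    bottom_up = [top_down[0]]
    for i in range(1, n):
        bottom_up.append(top_down[i] + downsample2x(bottom_up[i - 1]))

    # Assumption: the last (coarsest) level additionally fuses the
    # backbone feature map directly.
    bottom_up[-1] = bottom_up[-1] + backbone_feats[-1]
    return bottom_up


# Toy pyramid with three levels of constant features.
feats = [np.ones((4, 16, 16)), np.ones((4, 8, 8)), np.ones((4, 4, 4))]
outputs = path_aggregation(feats)
print([o.shape for o in outputs])
```

Each output keeps the resolution of its input level, so a detection head could be attached per level. In a trained network the additions would typically be preceded by convolutions that align channels and learn the fusion weights.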

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61873224, Grant 62003295, and Grant 41976182; in part by the S&T Program of Hebei under Grant F2020203037 and Grant F2019203031; and in part by the Science and Technology Research Project of Universities in Hebei under Grant QN2020301.

Author information

Corresponding author

Correspondence to Xinbin Li.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yu, H., Li, X., Feng, Y. et al. Multiple attentional path aggregation network for marine object detection. Appl Intell 53, 2434–2451 (2023). https://doi.org/10.1007/s10489-022-03622-0

  • DOI: https://doi.org/10.1007/s10489-022-03622-0
