Pas: a scale-invariant approach to maritime search and rescue object detection using preprocessing and attention scaling

Li, Shibao; Li, Chen; Wang, Zhaoyu; Jia, Zekun; Zhu, Jinze; Cui, Xuerong; Liu, Jianhang

doi:10.1007/s11370-024-00526-5

Original Research Paper
Published: 02 March 2024

Intelligent Service Robotics

Shibao Li ORCID: orcid.org/0000-0002-3924-9001¹,
Chen Li¹,
Zhaoyu Wang¹,
Zekun Jia¹,
Jinze Zhu¹,
Xuerong Cui¹ &
…
Jianhang Liu²

Abstract

Object detection is a primary tool in unmanned aerial vehicle (UAV) maritime search and rescue. Scale variation caused by changes in UAV flight altitude, changes in shooting angle, and large waves seriously degrades detection performance, yet most existing work does not explicitly account for these factors. In this work, we propose an algorithm called Preprocessing and Attention Scaling (PAS), which, for the first time, explicitly considers the scale variation caused by altitude changes, angle changes, and large waves, and addresses it through two modules: Preprocessing Scaling and Attention Scaling. The Preprocessing Scaling module rescales each image and applies a perspective transformation according to its recorded flight altitude and shooting angle, then crops it to an appropriate size, which significantly improves detection accuracy and shortens inference time. The Preprocessing Scaling module cannot, however, correct the scale variation that arises when large swells lift and lower an object. We therefore designed the Attention Scaling module, which fuses horizontal and vertical attention to quickly locate regions that need further rescaling and then brings them to an appropriate scale with an affine transformation, further improving detection accuracy. We extensively tested PAS on the well-known SeaDronesSee-DET and SeaDronesSee-DET v2 (S-ODv2) datasets, where it significantly improved detection accuracy. We also successfully tested our method on a height-angle transfer task, training on some height-angle intervals and testing on different ones, with good results.
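To make the idea behind altitude- and angle-based rescaling concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pinhole camera over a flat sea surface, so that apparent object size is inversely proportional to the slant range from the camera to the point at the image centre. The function names, the reference altitude of 30 m, and the off-nadir angle convention (0° means looking straight down) are all assumptions introduced for illustration. The perspective transformation and cropping steps described in the abstract are not covered here.

```python
import math


def slant_range(altitude_m: float, off_nadir_deg: float) -> float:
    """Distance from the camera to the sea-surface point at the image centre.

    Assumption: flat sea surface; off_nadir_deg is 0 when looking straight down.
    """
    if altitude_m <= 0:
        raise ValueError("altitude must be positive")
    if not 0 <= off_nadir_deg < 90:
        raise ValueError("off-nadir angle must be in [0, 90)")
    return altitude_m / math.cos(math.radians(off_nadir_deg))


def scale_factor(
    altitude_m: float,
    off_nadir_deg: float,
    ref_altitude_m: float = 30.0,      # hypothetical reference altitude
    ref_off_nadir_deg: float = 0.0,    # hypothetical reference angle
) -> float:
    """Resize factor that brings objects to the apparent size they would
    have at the reference altitude and angle (pinhole model)."""
    current = slant_range(altitude_m, off_nadir_deg)
    reference = slant_range(ref_altitude_m, ref_off_nadir_deg)
    # Objects twice as far away appear half as large, so upscale by 2.
    return current / reference


def scaled_size(width: int, height: int, factor: float) -> tuple[int, int]:
    """Target image size in pixels after applying the resize factor."""
    return max(1, round(width * factor)), max(1, round(height * factor))


if __name__ == "__main__":
    f = scale_factor(altitude_m=60.0, off_nadir_deg=0.0)
    print(f, scaled_size(1920, 1080, f))
```

Under these assumptions, an image taken from twice the reference altitude is enlarged by a factor of two, and an image taken from below the reference altitude is shrunk, which is one way a preprocessing step could both normalise object scale and reduce the number of pixels the detector has to process.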

Funding

Funding was provided by the National Natural Science Foundation of China (Grant No. 61972417) and the Natural Science Foundation of Shandong Province (Grant No. ZR2020MF005).

Author information

Authors and Affiliations

College of Ocean and Spatial Information, China University of Petroleum (East China), 66 Changjiang West Road, Huangdao District, Qingdao city, 266580, Shandong Province, China
Shibao Li, Chen Li, Zhaoyu Wang, Zekun Jia, Jinze Zhu & Xuerong Cui
College of Computer Science and Technology, China University of Petroleum (East China), 66 Changjiang West Road, Huangdao District, Qingdao city, 266580, Shandong Province, China
Jianhang Liu

Corresponding author

Correspondence to Shibao Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, S., Li, C., Wang, Z. et al. Pas: a scale-invariant approach to maritime search and rescue object detection using preprocessing and attention scaling. Intel Serv Robotics (2024). https://doi.org/10.1007/s11370-024-00526-5

Received: 19 June 2023
Accepted: 01 February 2024
Published: 02 March 2024
DOI: https://doi.org/10.1007/s11370-024-00526-5
