
Unmanned aerial vehicle (UAV) object detection algorithm based on keypoints representation and rotated distance-IoU loss

  • Research
  • Published in: Journal of Real-Time Image Processing

Abstract

Recently, significant progress has been made in unmanned aerial vehicle (UAV) object detection through deep learning, and the proliferation of UAVs has made the corresponding data much easier to acquire. However, UAV data sets contain many objects rotated in arbitrary orientations, which traditional horizontal-box detection methods struggle to localize precisely. Rotated bounding-box detectors have therefore been proposed, but several existing methods still suffer from the periodicity of the angle and the exchangeability of edges. To address these problems, we propose an object detection network that combines a keypoint representation with a rotated distance-IoU loss. It consists mainly of a keypoint representation module and the rotated distance-IoU loss. The keypoint representation encodes the angle of the rotated bounding box indirectly, as the angle between the horizontal line and the line connecting the center of the box to the midpoint of one of its boundary edges. The height of the rotated box is then obtained from the coordinates of the anchor center and this edge midpoint, while the width is predicted directly, so a rotated bounding box is represented by two points and a width. In addition, because the traditional rotated IoU loss does not take into account the distance between the centers of the predicted box and the ground-truth box during regression, we propose a rotated distance-IoU loss to replace it, which speeds up network convergence. Extensive experiments on the DOTA data set and the DroneVehicle data set demonstrate the effectiveness of the proposed method.
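The abstract describes the box encoding and the loss only in words; the following is a minimal sketch of the two ideas, written under stated assumptions. The function names, the use of atan2, and the enclosing-diagonal normalization are our illustrative choices modelled on the standard distance-IoU formulation (Zheng et al.), not the authors' implementation, which is not reproduced on this page.

import math

def decode_keypoint_box(cx, cy, bx, by, w):
    # (cx, cy): predicted center of the rotated box
    # (bx, by): predicted midpoint of one boundary edge
    # w:        directly predicted width of the box
    dx, dy = bx - cx, by - cy
    theta = math.atan2(dy, dx)    # angle between the center-to-edge-midpoint line and the horizontal
    h = 2.0 * math.hypot(dx, dy)  # the center-to-edge-midpoint distance is half the height
    return cx, cy, w, h, theta

def rotated_distance_iou_loss(riou, pred_center, gt_center, enclosing_diag):
    # riou:           rotated IoU of prediction and ground truth, computed elsewhere
    # pred_center:    (x, y) center of the predicted rotated box
    # gt_center:      (x, y) center of the ground-truth rotated box
    # enclosing_diag: diagonal of the smallest box enclosing both boxes (normalizes the distance)
    d2 = (pred_center[0] - gt_center[0]) ** 2 + (pred_center[1] - gt_center[1]) ** 2
    return 1.0 - riou + d2 / (enclosing_diag ** 2)

The added normalized center-distance term provides a useful gradient even when the predicted and ground-truth boxes barely overlap, which is the usual reason a distance-IoU-style loss converges faster than a plain (rotated) IoU loss.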



Data availability statement

The data were derived from the following resources available in the public domain: DOTA v1.0 (https://captain-whu.github.io/DOTA/dataset.html) and DroneVehicle (https://github.com/VisDrone/DroneVehicle).


Acknowledgements

This study was funded by Guangdong Basic and Applied Basic Research Foundation (No. 2021A1515011576), Guangdong Science and Technology Planning Project (No. 2021A0505030080, No. 2021A0505060011), Guangdong Higher Education Innovation and Strengthening School Project (No. 2020ZDZX3031, No. 2022ZDZX1032, No. 2023ZDZX1029), Wuyi University Hong Kong and Macao Joint Research and Development Fund (No. 2022WGALH19), Guangdong Jiangmen Science and Technology Research Project (No. 2220002000246, No. 2023760300070008390), Guangdong Science and Technology Innovation Strategy Special Fund (pdjh2022b0528, pdjh2024a374).

Author information

Contributions

Methodology, H.Z. and Y.H.; investigation, Y.Z. and H.Z.; data curation, J.Z. and F.D.; validation, Y.X.; writing-original draft preparation, Y.H. and J.Z.; writing-review and editing, Y.Z. and Y.X. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Yikui Zhai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhu, H., Huang, Y., Xu, Y. et al. Unmanned aerial vehicle (UAV) object detection algorithm based on keypoints representation and rotated distance-IoU loss. J Real-Time Image Proc 21, 58 (2024). https://doi.org/10.1007/s11554-024-01444-6

