Real-Time object detector based MobileNetV3 for UAV applications

Multimedia Tools and Applications

Abstract

With continuing progress in unmanned aerial vehicle (UAV) flight technology, a growing number of outdoor vision tasks rely on UAVs, and many of them require computer vision algorithms to analyze the images captured by the onboard camera. Deploying detectors on embedded devices is difficult, however, because energy consumption, accuracy, and speed must be traded off against one another. In this paper, we propose an end-to-end object detection model that runs on a UAV platform and is suitable for real-time applications. Building on a study of ShuffleNetV2 and MobileNetV3, we propose a new feature extraction network structure. To improve detection accuracy without sacrificing detection efficiency, we add a multi-scale fusion module based on deconvolution. Experiments show that, when deployed on our onboard Nvidia Jetson TX2 for testing and inference, our model, combined with a modified focal loss function, achieves 21.7% mAP for object detection at an inference speed of 17 fps.
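The abstract refers to a modified focal loss but does not say what the modification is. For orientation only, the following is a minimal sketch of the standard binary focal loss of Lin et al. (2017), FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), on which such a modification would presumably build. The function name `focal_loss`, the defaults `alpha=0.25` and `gamma=2.0` (the values suggested in the original focal loss paper), and the clamp `eps` are assumptions made for illustration; this is not the loss used in this article.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Standard binary focal loss for a single prediction (illustrative).

    p:     predicted probability of the positive class, in [0, 1]
    y:     ground-truth label, 1 (object) or 0 (background)
    alpha: class-balancing weight applied to positives (1 - alpha to negatives)
    gamma: focusing parameter; gamma = 0 gives alpha-weighted cross-entropy
    eps:   hypothetical lower clamp that keeps log() finite
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples,
    # so training concentrates on hard, misclassified ones.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = focal_loss(0.9, 1)  # confident, correct prediction
    hard = focal_loss(0.1, 1)  # confident, wrong prediction
    print(f"easy example loss: {easy:.6f}")
    print(f"hard example loss: {hard:.6f}")
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy. Larger `gamma` suppresses the contribution of easy examples, which matters in aerial imagery where easy background regions vastly outnumber small objects.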

Funding

This work was supported by the Natural Science Foundation of Shandong Province (grants ZR2021MD057 and ZR2020KE023). Jin Han is thankful for the financial support from the China Scholarship Council.

Author information

Corresponding author

Correspondence to Jin Han.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Y., Han, J. Real-Time object detector based MobileNetV3 for UAV applications. Multimed Tools Appl 82, 18709–18725 (2023). https://doi.org/10.1007/s11042-022-14196-x

  • DOI: https://doi.org/10.1007/s11042-022-14196-x
