Abstract
As UAV (unmanned aerial vehicle) flight technology advances, a growing number of outdoor vision tasks rely on UAVs, and many of them require computer vision algorithms to analyze the information captured by the onboard camera. However, deploying detectors on embedded devices is difficult because energy consumption, accuracy, and speed must be traded off against one another. In this paper, we propose an end-to-end object detection model that runs on a UAV platform and is suitable for real-time applications. Drawing on ShuffleNetV2 and MobileNetV3, we propose a new feature extraction network structure. To improve detection accuracy without sacrificing efficiency, we add a multi-scale fusion module based on deconvolution. Experiments show that, when deployed on our onboard Nvidia Jetson TX2 for testing and inference, our model, combined with a modified focal loss function, achieves 21.7% mAP for object detection at an inference speed of 17 fps.
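The abstract refers to a modified focal loss but does not state its form. For orientation only, the sketch below shows the standard binary focal loss of Lin et al. (2017), FL(p_t) = -α_t (1 - p_t)^γ log(p_t), not the authors' modified variant. The function names are hypothetical, and the defaults α = 0.25 and γ = 2 are the commonly used settings from that paper, assumed here for illustration.

```python
import math


def cross_entropy(p, y, eps=1e-7):
    """Binary cross-entropy for one prediction (baseline for comparison).

    p: predicted probability of the positive class; y: label, 0 or 1.
    """
    p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p       # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard binary focal loss (assumed form, not the paper's modification).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    examples so that hard examples dominate the total.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easy positive (p = 0.9) versus a hard positive (p = 0.1).
    for p in (0.9, 0.1):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

Setting `gamma=0` reduces the expression to α-weighted cross-entropy, which makes the effect of the focusing term easy to isolate when comparing the two losses.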
Funding
This work was supported by the Natural Science Foundation of Shandong Province (grants ZR2021MD057 and ZR2020KE023). Jin Han is thankful for the financial support from the China Scholarship Council.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Yang, Y., Han, J. Real-Time object detector based MobileNetV3 for UAV applications. Multimed Tools Appl 82, 18709–18725 (2023). https://doi.org/10.1007/s11042-022-14196-x
DOI: https://doi.org/10.1007/s11042-022-14196-x