Abstract
With the development of aviation technology, UAVs have become a research hotspot in the aviation field and play a vital role in both civil and military applications. Target detection and tracking in aerial video is a key technology for UAV reconnaissance, disaster relief, enemy monitoring, and military strikes. This paper studies the deep-learning-based YOLOv3 target detection algorithm and SiamMask target tracking algorithm and compares these algorithms. To address the long shooting distances and small targets typical of aerial photography, the algorithms are improved and fused to achieve rapid target identification and tracking in UAV aerial footage. In addition, to simplify the model and speed up training and inference, simplification of the tracking model is studied. Finally, an aerial photography experiment was conducted in an outdoor environment to verify the effectiveness of the UAV target detection and tracking algorithm.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, X., Wang, F., Xu, A., Zhang, G. (2022). UAV Aerial Photography Target Detection and Tracking Based on Deep Learning. In: Proceedings of the 5th China Aeronautical Science and Technology Conference. Lecture Notes in Electrical Engineering, vol 821. Springer, Singapore. https://doi.org/10.1007/978-981-16-7423-5_42
DOI: https://doi.org/10.1007/978-981-16-7423-5_42
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-7422-8
Online ISBN: 978-981-16-7423-5
eBook Packages: Engineering, Engineering (R0)