Abstract
This paper presents an analysis of the YOLOv5 object detection deep learning model trained on High-altitude Infrared Thermal (HIT) imagery captured by Unmanned Aerial Vehicles (UAVs). The performance of several YOLOv5 architectures, specifically ‘n’, ‘s’, ‘m’, ‘l’, and ‘x’, all trained with the same hyperparameters and data, is analyzed. The dependence of characteristics such as average precision, inference time, and latency on model size is investigated and compared for the infrared HIT-UAV dataset and the standard COCO dataset. The results show that the degradation of average precision with decreasing model size is much lower for the HIT-UAV dataset than for the COCO dataset. This can be explained by the fact that a significant amount of unnecessary information is removed from infrared thermal images (“pseudo segmentation”), which facilitates object detection. The significance of the research lies in comparing the performance of the various models on the COCO and HIT-UAV datasets; the findings suggest that infrared images are more effective at capturing the real-world characteristics needed for better object detection.
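The comparison of average-precision degradation across model sizes can be illustrated with a minimal sketch. It assumes that degradation is measured as the fractional loss in average precision relative to the largest ('x') model; the abstract does not state the exact measure, so this definition, the function name `relative_ap_drop`, and all numeric values below are hypothetical placeholders and not results from the paper.

```python
from typing import Dict


def relative_ap_drop(ap_by_size: Dict[str, float], largest: str = "x") -> Dict[str, float]:
    """Return the fractional AP loss of each model size relative to the largest model.

    This definition of "degradation" is an assumption made for illustration.
    """
    reference = ap_by_size[largest]
    if reference <= 0:
        raise ValueError("reference AP must be positive")
    return {size: (reference - ap) / reference for size, ap in ap_by_size.items()}


# Hypothetical placeholder values, NOT measurements from the paper.
# dataset_a stands in for a dataset where small models lose little accuracy,
# dataset_b for one where they lose a lot.
dataset_a = {"n": 0.40, "s": 0.45, "m": 0.48, "l": 0.49, "x": 0.50}
dataset_b = {"n": 0.25, "s": 0.35, "m": 0.45, "l": 0.48, "x": 0.50}

drops_a = relative_ap_drop(dataset_a)
drops_b = relative_ap_drop(dataset_b)

for size in ("n", "s", "m", "l", "x"):
    print(f"{size}: dataset_a drop {drops_a[size]:.2f}, dataset_b drop {drops_b[size]:.2f}")
```

Under this measure, a flatter curve of drops across 'n' to 'x' would correspond to the lower degradation that the abstract reports for HIT-UAV.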
Acknowledgements
This research was in part sponsored by the NATO Science for Peace and Security Programme under grant id. G6032.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Polukhin, A., Gordienko, Y., Jervan, G., Stirenko, S. (2023). Object Detection for Rescue Operations by High-Altitude Infrared Thermal Imaging Collected by Unmanned Aerial Vehicles. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_39
DOI: https://doi.org/10.1007/978-3-031-36616-1_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36615-4
Online ISBN: 978-3-031-36616-1
eBook Packages: Computer Science, Computer Science (R0)