
Efficient Deep Vision for Aerial Visual Understanding

  • Chapter
Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing

Abstract

Unmanned Aerial Vehicles (UAVs) are increasingly necessary for a broad range of applications, such as emergency response, monitoring of critical infrastructure, and disaster management. Owing to their affordability and camera capabilities, UAVs have become a common mobile camera platform for these kinds of applications. Visual perception using Convolutional Neural Networks (CNNs) and deep learning is therefore a key requirement for UAV-based applications. However, the remarkable performance of deep neural networks (DNNs) on vision tasks comes at the cost of high computational demands, a problem that is amplified in drone-based applications by limited energy resources. To address these drawbacks, this chapter highlights some of the key techniques for making deep vision more efficient in such resource-constrained applications. The techniques include, but are not limited to, data selection and reduction, efficient neural network design, and hardware-oriented model optimization. Results on different use cases show that these techniques provide improvements both when applied standalone and in combination.
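To illustrate one of the techniques the abstract names, efficient neural network design often relies on depthwise separable convolutions (as in MobileNets-style architectures). The following minimal sketch, using hypothetical layer sizes, compares parameter counts only; it is an illustration, not the chapter's implementation:

```python
# Parameter-count comparison: standard vs. depthwise separable convolution.
# Hypothetical layer: 3x3 kernels, 128 input channels, 256 output channels.

def standard_conv_params(k, c_in, c_out):
    # Each of the c_out filters spans all c_in input channels.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one k x k filter per input channel;
    # pointwise step: a 1x1 convolution that mixes channels.
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 128, 256)        # 294,912 parameters
sep = depthwise_separable_params(3, 128, 256)  #  33,920 parameters
print(std, sep, round(std / sep, 1))           # prints: 294912 33920 8.7
```

For this (assumed) layer configuration, the factorization cuts the parameter count by roughly 8.7x, which is the kind of saving that makes on-drone inference tractable under tight energy budgets.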


Notes

  1. Quad Core 1.2 GHz Broadcom 64-bit CPU.

  2. Samsung Exynos-5422 with Cortex-A15 2 GHz and Cortex-A7 octa-core CPUs and a Mali-T628 MP6 GPU.

  3. Samsung Exynos-5422 with Cortex-A15 2 GHz and Cortex-A7 octa-core CPUs and a Mali-T628 MP6 GPU.

  4. Quad Core 1.2 GHz Broadcom 64-bit CPU.


Acknowledgements

The project is co-financed by the European Regional Development Fund and the Republic of Cyprus through the Cyprus Research & Innovation Foundation (“RESTART 2016-2020” Program) (Grant No. INTEGRATED/0918/0056) (RONDA). This work was also supported by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 739551 (KIOS CoE) and from the Government of the Republic of Cyprus through the Directorate General for European Programs, Coordination, and Development.

Christos Kyrkou gratefully acknowledges the support of NVIDIA Corporation with the donation of the RTX A6000 GPU.

Author information

Correspondence to Theocharis Theocharides.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Makrigiorgis, R., Siddiqui, S., Kyrkou, C., Kolios, P., Theocharides, T. (2024). Efficient Deep Vision for Aerial Visual Understanding. In: Pasricha, S., Shafique, M. (eds) Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-40677-5_4


  • DOI: https://doi.org/10.1007/978-3-031-40677-5_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40676-8

  • Online ISBN: 978-3-031-40677-5

  • eBook Packages: Engineering (R0)
