
Efficient Deep Vision for Aerial Visual Understanding

  • Chapter
Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing

Abstract

Unmanned Aerial Vehicles (UAVs) are increasingly necessary for a broad range of applications, such as emergency response, monitoring of critical infrastructure, and disaster management. Owing to their affordability and camera capabilities, UAVs have become a common mobile camera platform for these kinds of applications. Visual perception using Convolutional Neural Networks (CNNs) and deep learning is therefore a key requirement for UAV-based applications. However, the remarkable performance of deep neural networks (DNNs) on vision tasks comes at the cost of high computational demands, a problem that is amplified in drone-based applications by limited energy resources. To address these drawbacks, this chapter highlights some of the key techniques for making deep vision more efficient in such resource-constrained applications. The techniques include, but are not limited to, data selection and reduction, efficient neural network design, and hardware-oriented model optimization. Results on different use cases show that these techniques provide improvements both when applied standalone and in combination.
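To illustrate one of the techniques the abstract names, efficient neural network design often relies on depthwise separable convolutions (as in MobileNets-style architectures). The following minimal sketch, using hypothetical layer sizes, compares parameter counts only; it is an illustration, not the chapter's implementation:

```python
# Parameter-count comparison: standard vs. depthwise separable convolution.
# Hypothetical layer: 3x3 kernels, 128 input channels, 256 output channels.

def standard_conv_params(k, c_in, c_out):
    # Each of the c_out filters spans all c_in input channels.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one k x k filter per input channel;
    # pointwise step: a 1x1 convolution that mixes channels.
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 128, 256)        # 294,912 parameters
sep = depthwise_separable_params(3, 128, 256)  #  33,920 parameters
print(std, sep, round(std / sep, 1))           # prints: 294912 33920 8.7
```

For this (assumed) layer configuration, the factorization cuts the parameter count by roughly 8.7x, which is the kind of saving that makes on-drone inference tractable under tight energy budgets.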


Notes

  1. Quad Core 1.2 GHz Broadcom 64-bit CPU.

  2. Samsung Exynos-5422 with Cortex-A15 2 GHz and Cortex-A7 octa-core CPUs and a Mali-T628 MP6 GPU.

  3. Samsung Exynos-5422 with Cortex-A15 2 GHz and Cortex-A7 octa-core CPUs and a Mali-T628 MP6 GPU.

  4. Quad Core 1.2 GHz Broadcom 64-bit CPU.


Acknowledgements

The project is co-financed by the European Regional Development Fund and the Republic of Cyprus through the Cyprus Research & Innovation Foundation (“RESTART 2016-2020” Program) (Grant No. INTEGRATED/0918/0056) (RONDA). This work was also supported by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 739551 (KIOS CoE) and from the Government of the Republic of Cyprus through the Directorate General for European Programs, Coordination, and Development.

Christos Kyrkou gratefully acknowledges the support of NVIDIA Corporation with the donation of the RTX A6000 GPU.

Author information

Correspondence to Theocharis Theocharides.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Makrigiorgis, R., Siddiqui, S., Kyrkou, C., Kolios, P., Theocharides, T. (2024). Efficient Deep Vision for Aerial Visual Understanding. In: Pasricha, S., Shafique, M. (eds) Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-40677-5_4


  • DOI: https://doi.org/10.1007/978-3-031-40677-5_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40676-8

  • Online ISBN: 978-3-031-40677-5

  • eBook Packages: Engineering (R0)
