An improved deep learning method for flying object detection and recognition

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

The demand for detecting and classifying flying objects such as birds, planes, and drones (unmanned aerial vehicles, UAVs) is increasing in fields such as the military and surveillance. A distant object appears point-sized in an image, which makes faraway objects difficult to classify. There is always a trade-off between correct object detection and the confidence value, so it is important to detect and classify objects correctly with high confidence. This paper introduces a hybrid model that combines a convolutional neural network (CNN) and long short-term memory (LSTM) to detect and classify UAVs. We first present a comparative study of algorithms from the You Only Look Once (YOLO) family. We also gathered and prepared an image dataset from sources such as GitHub, the University of California Irvine (UCI) Machine Learning Repository, and the International Conference on Computer Vision (ICCV) for experimentation. The proposed CNN-LSTM model extracts spatial characteristics from the input video sequence, while the memory capacity of the LSTM retains information across frames for object detection. Bayesian optimization is used for hyper-parameter tuning, which makes the results of the proposed hybrid CNN-LSTM model more promising than those of other state-of-the-art algorithms such as YOLO, R-CNN, faster R-CNN, SGD, and CNN. We also report detection accuracy at varying distances. The proposed model performs best with respect to precision, recall, training and validation accuracy, and loss, and its processing speed (frames per second, FPS) is nearly equivalent to that of faster R-CNN.
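The abstract's architecture can be illustrated with a minimal sketch: a small CNN extracts per-frame spatial features, and an LSTM aggregates them over the video sequence before classification. This is a hedged illustration only; the layer sizes, class count, and pooling choices below are assumptions for demonstration, not the authors' actual configuration.

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Illustrative CNN-LSTM hybrid: per-frame spatial features from a CNN,
    temporal aggregation with an LSTM. All layer sizes are hypothetical."""

    def __init__(self, num_classes=3, hidden=64):
        super().__init__()
        # Small CNN backbone applied to each frame independently
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),          # fixed 4x4 spatial output
        )
        self.lstm = nn.LSTM(input_size=32 * 4 * 4, hidden_size=hidden,
                            batch_first=True)  # remembers across frames
        self.head = nn.Linear(hidden, num_classes)  # e.g. bird / plane / drone

    def forward(self, x):                      # x: (batch, time, 3, H, W)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1))      # fold time into the batch dim
        feats = feats.flatten(1).view(b, t, -1)  # (batch, time, features)
        out, _ = self.lstm(feats)              # temporal feature sequence
        return self.head(out[:, -1])           # classify from the last step

model = CNNLSTM()
clip = torch.randn(2, 5, 3, 64, 64)   # two clips of five 64x64 RGB frames
logits = model(clip)
print(tuple(logits.shape))            # (2, 3): one logit vector per clip
```

In practice a full detector would also regress bounding boxes and be tuned (as the paper does) with Bayesian optimization over hyper-parameters such as the hidden size and learning rate; this sketch shows only the classification path.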


Availability of data and materials

All data were collected from well-known repositories on the Internet. The dataset prepared here is available on request.


Funding

The authors state that no funding was involved.

Author information

Contributions

SSA performed algorithm design and implementation, result analysis, and writing (original draft and editing). NW carried out the literature survey, result analysis, and writing (review and editing). AP performed algorithm implementation, result analysis, and writing (review and editing). NM provided the YOLO experimentation and result analysis. HA contributed the introduction and proofreading.

Corresponding author

Correspondence to Shailendra S. Aote.

Ethics declarations

Competing interests

The authors declare that they have no relevant financial or non-financial competing interests to disclose regarding any material discussed in this article.

Ethical approval

The proposed study does not require ethical approval.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aote, S.S., Wankhade, N., Pardhi, A. et al. An improved deep learning method for flying object detection and recognition. SIViP 18, 143–152 (2024). https://doi.org/10.1007/s11760-023-02703-y

