The world is witnessing a significant rise in both reported and unnoticed violations. In response, video surveillance can help capture otherwise unobserved actions that lead to violence and thereby contribute to public safety. Such surveillance can be carried out efficiently through activity classification from drone videos. Fields that already employ this technology include police work, video categorization, biometrics, and human–computer interaction. So far, no public dataset is available for violent activity classification using drone surveillance. Hence, this work investigates the automated recognition and classification of human actions from drone videos. In this study, the dataset is created using drones flown at different heights in an unconstrained environment. The method first performs key-point extraction to generate 2D skeletons for the persons in each frame. The extracted key points are then passed as features to the classification module to recognize the actions. The classification models used in the proposed method are SVM (support vector machine) and Random Forest. Experimental results show that the SVM model with an RBF (radial basis function) kernel is more efficient for activity classification than previously proposed approaches and the other models tested. The work also analyzes the run-time performance of the proposed system and shows that it achieves real-time performance.
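The step from 2D skeleton key points to an RBF-kernel classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the key points are turned into features, so the centroid and bounding-box normalization below is an assumption (chosen because footage taken from different heights changes a person's apparent size), and the names `skeleton_to_features` and `rbf_kernel` are hypothetical. The kernel itself is the standard definition k(u, v) = exp(-gamma * ||u - v||^2) that an RBF SVM evaluates between feature vectors.

```python
import math


def skeleton_to_features(keypoints):
    """Flatten one 2D skeleton into a feature vector.

    keypoints: list of (x, y) pixel coordinates, one per joint.
    ASSUMPTION: coordinates are centred on the skeleton centroid and divided
    by the larger bounding-box side, so position and apparent size cancel out.
    """
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    # Fall back to 1.0 if all key points coincide, to avoid dividing by zero.
    scale = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    features = []
    for x, y in keypoints:
        features.extend(((x - cx) / scale, (y - cy) / scale))
    return features


def rbf_kernel(u, v, gamma=1.0):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


# The same three-joint pose seen at two distances (toy values, not real data).
near = [(10.0, 10.0), (30.0, 10.0), (20.0, 50.0)]
far = [(0.0, 0.0), (2.0, 0.0), (1.0, 4.0)]
similarity = rbf_kernel(skeleton_to_features(near), skeleton_to_features(far))
```

With this normalization the two toy skeletons map to essentially the same feature vector, so their kernel similarity is close to its maximum of 1. In practice the feature vectors would be passed to an off-the-shelf SVM or Random Forest implementation rather than a hand-written kernel.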
The authors would like to thank everyone who helped them in the implementation of this project. The authors would also like to thank their organizations for giving them the opportunity to work in a collaborative manner.
The authors declare that there is no funding associated with this project.
Conflict of interest
The authors of this manuscript declare that there is no conflict of interest.
The authors of this manuscript confirm that: (i) informed, written consent has been obtained from the relevant sources wherever required; (ii) all procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1964 and its later amendments; (iii) approval and/or informed consent was obtained from human subjects wherever applicable.
About this article
Cite this article
Srivastava, A., Badal, T., Garg, A. et al. Recognizing human violent action using drone surveillance within real-time proximity. J Real-Time Image Proc 18, 1851–1863 (2021). https://doi.org/10.1007/s11554-021-01171-2
- Video surveillance
- Unconstrained environment
- Drone videos
- Key-point extraction
- Activity classification