Abstract
Computer vision research on human action recognition and violence detection has advanced considerably in recent years. Although various datasets are available for action and violence recognition, there is a clear lack of datasets that capture violent and non-violent activities simultaneously from an aerial view. This study presents a new aerial video dataset for concurrent human action recognition, including violence detection. It consists of 60 min of fully annotated data with two action classes: violent and normal (non-violent). The dataset covers several factors that existing datasets do not consider: changes in drone altitude, changes in the angle at which the video is captured, video captured during motion, changes in frame rate, videos from different cameras with different configurations, multiple labels for every subject, and labels for violent activities. The resulting dataset is a multifaceted representation of real-world scenarios that addresses several shortfalls of existing datasets. It is intended to advance computer vision applications for action recognition, particularly automated violence detection in real-time video streams from an aerial view. Furthermore, the curated dataset is validated for violence detection using machine learning and deep learning algorithms, namely Support Vector Machine (SVM), Long Short-Term Memory (LSTM), Bi-Directional LSTM (Bi-LSTM) and Adaptive Boosting (AdaBoost).
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mahajan, N., Chauhan, A., Kumar, H., Kaushal, S., Singh, S. (2024). Drone Watch: A Novel Dataset for Violent Action Recognition from Aerial Videos. In: Jha, P.K., Tripathi, B., Natarajan, E., Sharma, H. (eds) Proceedings of Congress on Control, Robotics, and Mechatronics. CRM 2023. Smart Innovation, Systems and Technologies, vol 364. Springer, Singapore. https://doi.org/10.1007/978-981-99-5180-2_35
DOI: https://doi.org/10.1007/978-981-99-5180-2_35
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-5520-6
Online ISBN: 978-981-99-5180-2
eBook Packages: Engineering, Engineering (R0)