Drone Watch: A Novel Dataset for Violent Action Recognition from Aerial Videos

Mahajan, Nitish; Chauhan, Amita; Kumar, Harish; Kaushal, Sakshi; Singh, Sarbjeet

doi:10.1007/978-981-99-5180-2_35


Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 364)

Included in the following conference series:

Congress on Control, Robotics, and Mechatronics


Abstract

Computer vision research on human action recognition and violence detection has advanced considerably in recent years. Although various datasets are available for action and violence recognition, there is a clear lack of datasets that capture non-violent and violent activities together from an aerial view. This study presents a new aerial video dataset for concurrent human action recognition, including violence detection. It consists of 60 min of fully annotated data with two action classes: violent and normal (non-violent). The dataset addresses several factors that existing datasets do not consider: changes in drone altitude, changes in the camera angle, video captured while the drone is in motion, varying frame rates, videos from different cameras with different configurations, multiple labels for every subject, and labels for violent activities. The resulting dataset is a multifaceted representation of real-world scenarios and addresses various shortfalls of existing datasets. It is intended to advance computer vision applications for action recognition, particularly automated violence detection in real-time video streams from an aerial view. Furthermore, the curated dataset is validated for violence detection using machine learning and deep learning algorithms, namely Support Vector Machine (SVM), Long Short-Term Memory (LSTM), Bi-Directional LSTM (Bi-LSTM) and Adaptive Boosting (AdaBoost).
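To make the annotation properties listed in the abstract concrete (multiple labels per subject, a violent/normal class, clips with differing frame rates), the following is a minimal sketch of one way such records could be represented. The schema, field names, label strings, and the clip-level labelling rule are all hypothetical assumptions for illustration; they are not the dataset's published format.

```python
from dataclasses import dataclass, field


# Hypothetical schema: field names and label strings are illustrative
# assumptions, not the dataset's actual annotation format.
@dataclass
class SubjectAnnotation:
    subject_id: int
    frame_index: int
    bbox: tuple  # (x, y, width, height) in pixels
    labels: list = field(default_factory=list)  # several labels per subject


@dataclass
class Clip:
    name: str
    fps: float
    num_frames: int
    annotations: list = field(default_factory=list)

    def duration_seconds(self) -> float:
        # Duration depends on each clip's own frame rate.
        return self.num_frames / self.fps

    def clip_label(self) -> str:
        # Assumed rule: a clip counts as "violent" if any subject in any
        # frame carries the "violent" label; otherwise it is "normal".
        for ann in self.annotations:
            if "violent" in ann.labels:
                return "violent"
        return "normal"


def total_minutes(clips) -> float:
    # Sum clip durations (in seconds) and convert to minutes.
    return sum(c.duration_seconds() for c in clips) / 60.0


# Made-up example data, for illustration only.
clips = [
    Clip("clip_a", fps=30.0, num_frames=900, annotations=[
        SubjectAnnotation(1, 0, (10, 20, 40, 80), ["walking"]),
        SubjectAnnotation(2, 0, (60, 25, 42, 82), ["punching", "violent"]),
    ]),
    Clip("clip_b", fps=25.0, num_frames=750, annotations=[
        SubjectAnnotation(1, 0, (15, 30, 38, 78), ["standing"]),
    ]),
]

print([c.clip_label() for c in clips])
print(total_minutes(clips))
```

Computing duration per clip from its own frame rate, rather than assuming a single global rate, matters when footage comes from different cameras, as the abstract describes.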



Author information

Authors and Affiliations

University Institute of Engineering and Technology, Panjab University, Chandigarh, India
Nitish Mahajan, Amita Chauhan, Harish Kumar, Sakshi Kaushal & Sarbjeet Singh


Corresponding author

Correspondence to Nitish Mahajan.

Editor information

Editors and Affiliations

Department of Mechanical and Industrial Engineering, Indian Institute of Technology Roorkee, Roorkee, India
Pradeep Kumar Jha
Department of Mechanical Engineering, Rajasthan Technical University, Kota, Rajasthan, India
Brijesh Tripathi
Department of Mechanical and Mechatronic Engineering, UCSI University, Cheras, Kuala Lumpur, Malaysia
Elango Natarajan
Department of Computer Engineering, Rajasthan Technical University, Kota, Rajasthan, India
Harish Sharma


Cite this paper

Mahajan, N., Chauhan, A., Kumar, H., Kaushal, S., Singh, S. (2024). Drone Watch: A Novel Dataset for Violent Action Recognition from Aerial Videos. In: Jha, P.K., Tripathi, B., Natarajan, E., Sharma, H. (eds) Proceedings of Congress on Control, Robotics, and Mechatronics. CRM 2023. Smart Innovation, Systems and Technologies, vol 364. Springer, Singapore. https://doi.org/10.1007/978-981-99-5180-2_35


DOI: https://doi.org/10.1007/978-981-99-5180-2_35
Published: 10 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-5520-6
Online ISBN: 978-981-99-5180-2
eBook Packages: Engineering, Engineering (R0)
