Abstract
To detect violence in surveillance camera footage, we propose a neural architecture that recognizes violent activity and can thereby serve as a measure to prevent disorder. The architecture uses a pre-trained ResNet-50 model to extract features from the video frames and feeds them into a ConvLSTM block. We use short-term differences between video frames to make the model more robust to occlusions and discrepancies. Convolutional neural networks let us extract more concentrated spatio-temporal features from the frames, which suits the sequential nature of video fed into LSTMs. The model thus consists of a pre-trained convolutional neural network connected to a convolutional LSTM layer. It takes raw video as input, converts it into frames, and outputs a binary label: violence or non-violence. We pre-process the video frames using cropping, dark-edge removal, and other data augmentation techniques to strip unnecessary detail from the data. We evaluate the proposed method on three standard public datasets, using accuracy as the evaluation metric.
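The short-term frame difference and the accuracy metric mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the (T, H, W, C) array layout, and the 224x224 frame size are assumptions made for illustration, and the abstract does not say how the differences are combined with the ResNet-50 features.

```python
import numpy as np


def short_term_differences(frames: np.ndarray) -> np.ndarray:
    """Return the differences between consecutive frames.

    frames: array of shape (T, H, W, C) holding T video frames
            (layout assumed here, not taken from the paper).
    Returns an array of shape (T - 1, H, W, C).
    """
    if frames.ndim != 4 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames shaped (T, H, W, C)")
    # Cast to float first so that uint8 subtraction cannot wrap around.
    frames = frames.astype(np.float32)
    return frames[1:] - frames[:-1]


def accuracy(predicted, actual) -> float:
    """Fraction of clips whose predicted label matches the ground truth."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    if predicted.shape != actual.shape:
        raise ValueError("predicted and actual must have the same shape")
    return float(np.mean(predicted == actual))


if __name__ == "__main__":
    # Hypothetical clip: 8 random RGB frames of 224x224 pixels.
    rng = np.random.default_rng(0)
    clip = rng.integers(0, 256, size=(8, 224, 224, 3), dtype=np.uint8)
    print(short_term_differences(clip).shape)

    # Hypothetical binary labels: 1 = violence, 0 = non-violence.
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Differencing adjacent frames suppresses static background and keeps only what changed between them, which is one plausible reading of why the abstract credits it with added robustness.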
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sharma, M., Baghel, R. (2020). Video Surveillance for Violence Detection Using Deep Learning. In: Borah, S., Emilia Balas, V., Polkowski, Z. (eds) Advances in Data Science and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 37. Springer, Singapore. https://doi.org/10.1007/978-981-15-0978-0_40
DOI: https://doi.org/10.1007/978-981-15-0978-0_40
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0977-3
Online ISBN: 978-981-15-0978-0
eBook Packages: Intelligent Technologies and Robotics (R0)