TS-MDA: two-stream multiscale deep architecture for crowd behavior prediction

  • Regular Paper
Multimedia Systems

Abstract

In recent years, crowd behavior prediction (CBP) has attracted considerable research attention because it can help prevent crowd disasters. CBP has been formulated either as a one-class classification (OCC) problem or as a multi-class classification (MCC) problem. OCC-based CBP models learn normal crowd behavior patterns and treat outliers as anomalies, that is, abnormal crowd behaviors. However, these models do not distinguish between anomaly types and treat all of them as a single class. MCC-based CBP models overcome this drawback, but very few datasets and models have been proposed for them. Current state-of-the-art MCC-based CBP approaches exploit spatial–temporal features but do not address two crucial challenges in crowd scenes: (a) human-scale variation caused by perspective distortion and (b) the effects of cluttered backgrounds. To this end, an end-to-end trainable two-stream multiscale deep architecture (TS-MDA) is proposed for MCC-based CBP. The first stream uses a deep convolutional neural network to extract multiscale spatial features from the frames, which handles human-scale variation. The second stream extracts multiscale temporal features from de-background frames using a multi-layer dilated convolutional long short-term memory network. The effect of the cluttered background is minimized by producing the de-background frames with a visual background extractor algorithm. The multiscale features from the two streams are concatenated and used to classify different crowd behaviors. Experiments are conducted on two large-scale crowd behavior datasets, MED and GTA. The results show that the proposed model outperforms state-of-the-art MCC-based CBP approaches.
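
The two ideas the abstract leans on, dilation as a way to obtain multiscale temporal features and concatenation as the fusion step, can be illustrated with a toy sketch. This is not the authors' model: it uses a plain 1D dilated convolution over a scalar time series instead of a dilated convolutional LSTM over frames, and the function names, the kernel, and the dilation rates are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """'Valid' 1D dilated convolution (cross-correlation) along time.

    Kernel taps are spaced `dilation` steps apart, so a larger dilation
    looks at a longer temporal window with the same number of weights.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    span = (len(kernel) - 1) * dilation  # distance from first to last tap
    if len(signal) <= span:
        raise ValueError("signal is too short for this kernel and dilation")
    return [
        sum(k * signal[t + i * dilation] for i, k in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


def multiscale_temporal_features(signal: Sequence[float],
                                 kernel: Sequence[float],
                                 dilations: Sequence[int]) -> List[float]:
    """One summary value (mean response) per dilation rate.

    Assumption: mean pooling is used here only to keep the toy small;
    the abstract does not say how responses are aggregated.
    """
    features = []
    for d in dilations:
        response = dilated_conv1d(signal, kernel, d)
        features.append(sum(response) / len(response))
    return features


def fuse_streams(spatial: Sequence[float],
                 temporal: Sequence[float]) -> List[float]:
    """Fuse the two streams by concatenating their feature vectors."""
    return list(spatial) + list(temporal)


if __name__ == "__main__":
    motion = [1, 2, 3, 4, 5]      # hypothetical per-frame motion magnitude
    spatial = [0.5, 0.25]         # hypothetical spatial-stream features
    temporal = multiscale_temporal_features(motion, [1, -1], [1, 2])
    print(fuse_streams(spatial, temporal))
```

In this sketch the difference kernel `[1, -1]` responds to change over one step at dilation 1 and over two steps at dilation 2, which is the sense in which stacking several dilation rates yields features at several temporal scales. A classifier would then operate on the concatenated vector.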

Acknowledgements

The authors gratefully acknowledge the support and resources provided by the 'PARAM Shivay Facility' under the National Supercomputing Mission, Government of India, at the Indian Institute of Technology, Varanasi.

Author information

Corresponding author

Correspondence to Santosh Kumar Tripathy.

Additional information

Communicated by Ichiro IDE.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tripathy, S.K., Kostha, H. & Srivastava, R. TS-MDA: two-stream multiscale deep architecture for crowd behavior prediction. Multimedia Systems 29, 15–31 (2023). https://doi.org/10.1007/s00530-022-00975-x

  • DOI: https://doi.org/10.1007/s00530-022-00975-x
