Abstract
Unexpected intrusions by pedestrians at grade crossings pose significant risks to the safety of railroad operations, yet no information processing system currently exists for monitoring anomalies at grade crossings. This paper therefore presents a video processing pipeline and a generative adversarial network (GAN)-based deep learning framework to detect, localize, and analyze abnormal pedestrian behaviors at grade crossings. First, pedestrian motion is represented by temporally varying trajectories of key points identified by skeleton detection and tracking algorithms. A GAN model is then trained to learn both global and local motion features of normal pedestrians only, so that abnormal behaviors can be detected as outliers by the discriminator during testing. In contrast to existing efforts, several measures further boost model performance: overfitting is purposely exploited during training to magnify the difference between normal and abnormal motion patterns, and an appropriate amount of noise is added to enhance model generalization. Experiments on a custom video dataset demonstrate the strong performance of the model, which successfully distinguishes squatting and lingering from normal walking, achieves an area under the curve (AUC) of 0.89, and notably outperforms seven benchmark models. The method can also analyze multiple pedestrians in a single video frame with one run of the GAN model and requires no location-specific information, making it robust and field-deployable at different locations without retraining.
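The evaluation protocol summarized above — scoring each pedestrian's skeleton trajectory with the trained discriminator, flagging low-realness trajectories as outliers, and reporting AUC — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `discriminator` callable, the trajectory shape `(T, K, 2)` (T frames, K key points, x/y coordinates), and all function names are hypothetical stand-ins.

```python
import numpy as np

def anomaly_scores(trajectories, discriminator):
    """Score each pedestrian's key-point trajectory.

    trajectories: iterable of arrays of shape (T, K, 2) -- T frames,
    K skeleton key points, (x, y) per point.
    discriminator: callable mapping one trajectory to a "realness"
    value in [0, 1]; trajectories the discriminator finds unrealistic
    (abnormal motion) receive high anomaly scores.
    """
    return np.array([1.0 - discriminator(t) for t in trajectories])

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Assumes no tied scores; labels are 1 for abnormal, 0 for normal.
    """
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

Because every pedestrian in a frame yields its own trajectory, one pass over the batch of trajectories scores all pedestrians at once, matching the single-run property claimed in the abstract.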
Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This research is partially funded by the Federal Railroad Administration (FRA) under Contract No. 693JJ620C000021. Dr. Shala Blue, Mr. Francesco Bedini, Mr. Michael Jones, and Dr. Starr Kidda from the FRA provided essential guidance and insight. The City of Columbia (especially the Columbia Fire Department, Department of Transportation, and 911 Dispatching Center) and CSX provided tremendous help. Mr. Thomas Johnson, Mr. Tianqi Huang, and Dr. Zhuocheng Jiang made significant contributions to imagery data generation. The opinions expressed in this article are solely those of the authors and do not represent those of the funding agencies.
Author information
Contributions
Conceptualization: Ge Song, Yu Qian, Yi Wang; Methodology: Ge Song; Formal analysis and investigation: Ge Song, Yu Qian, Yi Wang; Writing - original draft preparation: Ge Song; Writing - review and editing: Yu Qian, Yi Wang; Funding acquisition: Yu Qian.
Ethics declarations
Ethical and informed consent
Informed consent was obtained from all individual participants included in the study.
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Song, G., Qian, Y. & Wang, Y. Analysis of abnormal pedestrian behaviors at grade crossings based on semi-supervised generative adversarial networks. Appl Intell 53, 21676–21691 (2023). https://doi.org/10.1007/s10489-023-04639-9