Traffic Accident Benchmark for Causality Recognition

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12352)

Abstract

We propose a new benchmark for analyzing causality in traffic accident videos by decomposing each accident into a pair of events, a cause and an effect. We collect videos containing traffic accident scenes and annotate the cause and effect events of each accident with their temporal intervals and semantic labels; such annotations are not available in existing datasets for the accident anticipation task. Our dataset has two advantages over existing ones, both of which facilitate practical research on causality analysis. First, decomposing an accident into cause and effect events provides atomic cues for reasoning about a complex environment and planning future actions. Second, predicting the cause and effect of an accident makes a system more interpretable to humans, which helps reduce ambiguity over the legal liability of the agents involved in the accident. Using the proposed dataset, we analyze accidents by localizing the temporal intervals of their causes and effects and by classifying the semantic labels of the accidents. The dataset and the implementations of the baseline models are available in the code repository (https://github.com/tackgeun/CausalityInTrafficAccident).
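
To make the annotation structure described above concrete, the following is a minimal sketch of how a cause and effect pair with temporal intervals and semantic labels might be represented, together with a temporal intersection-over-union score of the kind commonly used to compare a predicted interval with an annotated one. The class names, field names, label strings, and the choice of seconds as the time unit are illustrative assumptions; they are not taken from the dataset or from the linked repository, and the abstract does not state which evaluation metric the authors use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One annotated event: a semantic label plus a temporal interval (assumed seconds)."""
    label: str
    start: float
    end: float


@dataclass(frozen=True)
class AccidentAnnotation:
    """Hypothetical record pairing the cause and effect events of one accident video."""
    video_id: str
    cause: Event
    effect: Event


def temporal_iou(a: Event, b: Event) -> float:
    """Intersection over union of two temporal intervals (0.0 if they do not overlap)."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - intersection
    return intersection / union if union > 0 else 0.0


# Illustrative values only; the labels and timestamps are made up for this sketch.
truth = AccidentAnnotation(
    video_id="example_0001",
    cause=Event("vehicle_runs_red_light", 2.0, 6.0),
    effect=Event("side_collision", 6.0, 9.0),
)
predicted_cause = Event("vehicle_runs_red_light", 4.0, 8.0)

# Overlap between the predicted and the annotated cause interval.
cause_iou = temporal_iou(truth.cause, predicted_cause)
```

Keeping the cause and the effect as separate events in one record mirrors the decomposition the abstract describes, so each interval can be scored independently.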

Notes

Acknowledgement

This work was supported by the Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) [2017-0-01780, 2017-0-01779] and by Microsoft Research Asia. We also thank Jonghwan Mun and Ilchae Jung for valuable discussions.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of CSE, POSTECH, Pohang, Korea
  2. Department of ECE and ASRI, Seoul National University, Seoul, Korea
