
Cut-in maneuver detection with self-supervised contrastive video representation learning

Original Paper · Signal, Image and Video Processing

Abstract

Detecting the maneuvers of surrounding vehicles is essential for an autonomous vehicle to act in time and avoid possible accidents. This study proposes a framework based on contrastive representation learning to detect potentially dangerous cut-in maneuvers that may occur in front of the ego vehicle. First, an encoder network is trained in a self-supervised fashion with a contrastive loss, where two augmented versions of the same video clip stay close to each other in the embedding space, while augmentations of different videos stay far apart. Since no maneuver labeling is required in this step, a relatively large dataset can be used. After this self-supervised training, the encoder is fine-tuned on our cut-in/lane-pass labeled datasets. Instead of using the original video frames, we simplify the scene by highlighting the surrounding vehicles and the ego lane. We investigate several classification heads, augmentation types, and scene simplification alternatives. The most successful model outperforms the best fully supervised model by ~2%, reaching an accuracy of 92.52%.
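To make the pretraining step concrete, below is a minimal sketch of an NT-Xent (InfoNCE) contrastive objective of the kind used in SimCLR-style frameworks, matching the behavior described above: embeddings of two augmentations of the same clip are pulled together, while the other clips in the batch act as negatives. The PyTorch code, the temperature value, and the encoder call are illustrative assumptions, not the authors' exact implementation.

    import torch
    import torch.nn.functional as F

    def nt_xent_loss(z1, z2, temperature=0.1):
        # z1, z2: (N, D) embeddings of two augmented views of the same N clips.
        # Positive pairs are (z1[i], z2[i]); every other clip in the batch
        # serves as a negative. Temperature is an illustrative choice.
        n = z1.size(0)
        z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, D), unit norm
        sim = z @ z.t() / temperature                       # scaled cosine similarity
        sim.fill_diagonal_(float('-inf'))                   # exclude self-similarity
        # Row i (for i < N) has its positive at column i + N, and vice versa.
        targets = torch.cat([torch.arange(n) + n, torch.arange(n)]).to(z.device)
        return F.cross_entropy(sim, targets)

    # Illustrative pretraining step (encoder, augmentations, and optimizer
    # are placeholders, not the paper's exact components):
    #   z1, z2 = encoder(aug_a(clips)), encoder(aug_b(clips))
    #   loss = nt_xent_loss(z1, z2)
    #   loss.backward(); optimizer.step()

After pretraining, the projection head would typically be discarded and a small classification head (e.g., a linear or MLP layer) attached to the encoder for fine-tuning on the labeled cut-in/lane-pass clips, as the abstract describes.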


Data availability

The labeled and unlabeled simplified scene representation data are publicly available on GitHub.


Funding

Y. Nalcakan is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) 2244 Scholarship (Grant No. 2244-118C079). The numerical calculations reported in this paper were fully performed at the TUBITAK ULAKBIM High Performance and Grid Computing Center (TRUBA resources).

Author information

Contributions

Y. Nalcakan prepared all of the datasets, implemented the machine learning methods, and performed the experiments. Both authors wrote the manuscript, prepared the figures, and designed the detailed steps of the work. We confirm that the manuscript has been read and approved by both authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by both of us.

Corresponding author

Correspondence to Yagiz Nalcakan.

Ethics declarations

Conflict of interest

We wish to confirm that there are no known conflicts of interest associated with this publication.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Nalcakan, Y., Bastanlar, Y. Cut-in maneuver detection with self-supervised contrastive video representation learning. SIViP 17, 2915–2923 (2023). https://doi.org/10.1007/s11760-023-02512-3

