
MC-GTA: A Synthetic Benchmark for Multi-Camera Vehicle Tracking

  • Conference paper
  • In: Image Analysis and Processing – ICIAP 2023 (ICIAP 2023)

Abstract

Multi-camera vehicle tracking (MCVT) aims to trace multiple vehicles among videos gathered from overlapping and non-overlapping city cameras. It is beneficial for city-scale traffic analysis and management as well as for security. However, developing MCVT systems is challenging, and their real-world applicability is hampered by the lack of data for training and testing computer vision deep learning-based solutions. Indeed, creating new annotated datasets is cumbersome, as it requires great human effort and often raises privacy concerns. To alleviate this problem, we introduce MC-GTA - Multi Camera Grand Tracking Auto, a synthetic collection of images gathered from the virtual world provided by the highly-realistic Grand Theft Auto 5 (GTA) video game. Our dataset has been recorded from several cameras monitoring urban scenes at various crossroads. The annotations, consisting of bounding boxes localizing the vehicles with associated unique IDs consistent across the video sources, have been automatically generated by interacting with the game engine. To assess this simulated scenario, we conduct a performance evaluation using an MCVT SOTA approach, showing that it can be a valuable benchmark that mitigates the need for real-world data. The MC-GTA dataset and the code for creating new ad-hoc custom scenarios are available at https://github.com/GaetanoV10/GT5-Vehicle-BB.
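The annotations described above pair each bounding box with a vehicle ID that stays consistent across camera views, which is the property MCVT methods are evaluated on. As an illustration only, the Python sketch below shows how such cross-camera ground truth could be loaded and sanity-checked; it assumes a hypothetical MOT-style layout (one text file per camera, with rows of frame, vehicle_id, x, y, w, h), not the actual MC-GTA format, which is documented in the linked repository.

    # Illustrative sketch only: MC-GTA's real annotation layout is defined in its
    # repository (https://github.com/GaetanoV10/GT5-Vehicle-BB). Here we ASSUME a
    # MOT-style text file per camera with rows: frame, vehicle_id, x, y, w, h,
    # where vehicle_id is globally consistent across cameras.
    import csv
    from collections import defaultdict
    from pathlib import Path

    def load_tracks(annotation_dir):
        """Group ground-truth boxes by global vehicle ID across all camera files."""
        tracks = defaultdict(list)  # vehicle_id -> list of (camera, frame, (x, y, w, h))
        for ann_file in sorted(Path(annotation_dir).glob("*.txt")):
            camera = ann_file.stem  # e.g. "cam_01" (hypothetical naming)
            with open(ann_file, newline="") as f:
                for frame, vid, x, y, w, h in csv.reader(f):
                    tracks[int(vid)].append(
                        (camera, int(frame), (float(x), float(y), float(w), float(h)))
                    )
        return tracks

    if __name__ == "__main__":
        tracks = load_tracks("annotations/")  # hypothetical directory layout
        multi_cam = {vid for vid, dets in tracks.items()
                     if len({cam for cam, _, _ in dets}) > 1}
        print(f"{len(multi_cam)} of {len(tracks)} vehicles appear in more than one camera")

Counting how many IDs are observed by more than one camera is a quick way to confirm that identities are indeed shared across the video sources rather than assigned per camera.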


Notes

  1. https://github.com/GaetanoV10/GT5-Vehicle-BB.

  2. http://www.dev-c.com/gtav/scripthookv/.

  3. https://github.com/royukira/AIC22_Track1_MTMC_ID10.

  4. https://www.aicitychallenge.org/.


Acknowledgements

Supported by: MOST - Sustainable Mobility National Research Center, funded by the European Union NextGenerationEU (Piano Nazionale di Ripresa e Resilienza (PNRR) - Missione 4 Componente 2, Investimento 1.4 - D.D. 1033 17/06/2022, CN00000023); AI4Media – A European Excellence Centre for Media, Society, and Democracy (EC, H2020 No. 951911); SUN – Social and hUman ceNtered XR (EC, Horizon Europe No. 101092612).

Author information


Corresponding author

Correspondence to Luca Ciampi.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Ciampi, L., Messina, N., Valenti, G.E., Amato, G., Falchi, F., Gennaro, C. (2023). MC-GTA: A Synthetic Benchmark for Multi-Camera Vehicle Tracking. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds) Image Analysis and Processing – ICIAP 2023. ICIAP 2023. Lecture Notes in Computer Science, vol 14233. Springer, Cham. https://doi.org/10.1007/978-3-031-43148-7_27


  • DOI: https://doi.org/10.1007/978-3-031-43148-7_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43147-0

  • Online ISBN: 978-3-031-43148-7

  • eBook Packages: Computer Science, Computer Science (R0)
