MC-GTA: A Synthetic Benchmark for Multi-Camera Vehicle Tracking

Ciampi, Luca; Messina, Nicola; Valenti, Gaetano Emanuele; Amato, Giuseppe; Falchi, Fabrizio; Gennaro, Claudio

doi:10.1007/978-3-031-43148-7_27

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14233)

Included in the following conference series:

International Conference on Image Analysis and Processing

Abstract

Multi-camera vehicle tracking (MCVT) aims to trace multiple vehicles across videos gathered from overlapping and non-overlapping city cameras. It is beneficial for city-scale traffic analysis and management as well as for security. However, developing MCVT systems is challenging, and their real-world applicability is hindered by the lack of data for training and testing deep learning-based computer vision solutions. Indeed, creating new annotated datasets is cumbersome, as it requires great human effort and often raises privacy concerns. To alleviate this problem, we introduce MC-GTA (Multi Camera Grand Tracking Auto), a synthetic collection of images gathered from the virtual world provided by the highly realistic Grand Theft Auto 5 (GTA) video game. Our dataset has been recorded from several cameras observing urban scenes at various crossroads. The annotations, consisting of bounding boxes localizing the vehicles with associated unique IDs consistent across the video sources, have been automatically generated by interacting with the game engine. To assess this simulated scenario, we conduct a performance evaluation using a state-of-the-art (SOTA) MCVT approach, showing that the dataset can be a valuable benchmark that mitigates the need for real-world data. The MC-GTA dataset and the code for creating new ad-hoc custom scenarios are available at https://github.com/GaetanoV10/GT5-Vehicle-BB.
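To make the annotation structure described in the abstract concrete, the following minimal Python sketch shows how bounding boxes with vehicle IDs shared across cameras could be grouped into per-vehicle, per-camera tracks. The record layout (the `Box` fields, the top-left/width/height box convention, the camera names) and the helper functions are assumptions made for illustration; they are not the actual MC-GTA file format or code from the authors' repository.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """One annotated vehicle in one frame of one camera (hypothetical schema)."""
    camera_id: str
    frame: int
    vehicle_id: int  # assumed global ID, identical in every camera view
    x: float         # top-left corner in pixels (assumed convention)
    y: float
    w: float
    h: float


def tracks_by_vehicle(boxes):
    """Group boxes as {vehicle_id: {camera_id: [boxes sorted by frame]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for box in boxes:
        grouped[box.vehicle_id][box.camera_id].append(box)
    return {
        vid: {cam: sorted(bs, key=lambda b: b.frame) for cam, bs in per_cam.items()}
        for vid, per_cam in grouped.items()
    }


def multi_camera_vehicles(boxes):
    """Return the sorted IDs of vehicles observed by at least two cameras."""
    tracks = tracks_by_vehicle(boxes)
    return sorted(vid for vid, per_cam in tracks.items() if len(per_cam) >= 2)


# Toy annotations: vehicle 7 appears in two cameras, vehicle 9 in one.
sample = [
    Box("cam0", 11, 7, 104.0, 50.0, 40.0, 30.0),
    Box("cam0", 10, 7, 100.0, 50.0, 40.0, 30.0),
    Box("cam1", 42, 7, 300.0, 80.0, 38.0, 28.0),
    Box("cam1", 42, 9, 20.0, 60.0, 45.0, 33.0),
]

if __name__ == "__main__":
    print(multi_camera_vehicles(sample))
```

Because the IDs are consistent across video sources, cross-camera association reduces to a lookup on `vehicle_id` in the ground truth, which is what lets such annotations serve as a reference when scoring an MCVT system's predicted associations.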

Acknowledgements

Supported by: MOST - Sustainable Mobility National Research Center, funded by the European Union NextGenerationEU (Piano Nazionale di Ripresa e Resilienza (PNRR) - Missione 4 Componente 2, Investimento 1.4 - D.D. 1033 17/06/2022, CN00000023); AI4Media – A European Excellence Centre for Media, Society, and Democracy (EC, H2020 No. 951911); SUN – Social and hUman ceNtered XR (EC, Horizon Europe No. 101092612).

Author information

Authors and Affiliations

Institute of Information Science and Technologies, ISTI-CNR, Via G. Moruzzi 1, 56124, Pisa, Italy
Luca Ciampi, Nicola Messina, Giuseppe Amato, Fabrizio Falchi & Claudio Gennaro
Department of Information Engineering, University of Pisa, Via G. Caruso 16, 56122, Pisa, Italy
Gaetano Emanuele Valenti

Corresponding author

Correspondence to Luca Ciampi.

Editor information

Editors and Affiliations

University of Udine, Udine, Italy
Gian Luca Foresti
University of Udine, Udine, Italy
Andrea Fusiello
University of York, York, UK
Edwin Hancock

Cite this paper

Ciampi, L., Messina, N., Valenti, G.E., Amato, G., Falchi, F., Gennaro, C. (2023). MC-GTA: A Synthetic Benchmark for Multi-Camera Vehicle Tracking. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds) Image Analysis and Processing – ICIAP 2023. ICIAP 2023. Lecture Notes in Computer Science, vol 14233. Springer, Cham. https://doi.org/10.1007/978-3-031-43148-7_27

DOI: https://doi.org/10.1007/978-3-031-43148-7_27
Published: 05 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43147-0
Online ISBN: 978-3-031-43148-7
eBook Packages: Computer Science, Computer Science (R0)
