A Study of Using Synthetic Data for Effective Association Knowledge Learning

  • Research Article
  • Open access
  • Published: 08 March 2023
  • volume 20, pages 194–206 (2023)
  • Machine Intelligence Research

  • Yuchi Liu (affiliation 1), ORCID: orcid.org/0000-0001-9061-6180
  • Zhongdao Wang (affiliation 2), ORCID: orcid.org/0000-0002-4483-8783
  • Xiangxin Zhou (affiliation 2), ORCID: orcid.org/0000-0002-1526-0548
  • Liang Zheng (affiliation 1), ORCID: orcid.org/0000-0002-1464-9500

Abstract

Association, which aims to link bounding boxes of the same identity across a video sequence, is a central component of multi-object tracking (MOT). Association modules, e.g., parametric networks, are usually trained on real video data. However, annotating person tracks in consecutive video frames is expensive, and real data, being inflexible, offer limited opportunities to evaluate system performance under changing tracking scenarios. In this paper, we study whether 3D synthetic data can replace real-world videos for association training. Specifically, we introduce a large-scale synthetic data engine named MOTX, in which the motion characteristics of cameras and objects are manually configured to be similar to those of real-world datasets. We show that association knowledge learned from synthetic data achieves performance very similar to that learned from real data on real-world test sets, without domain adaptation techniques. We attribute this intriguing observation to two factors. First and foremost, 3D engines can simulate motion factors such as camera movement, camera view, and object movement well, so the simulated videos provide association modules with effective motion features. Second, the experimental results show that the appearance domain gap hardly harms the learning of association knowledge. In addition, the strong customization ability of MOTX allows us to quantitatively assess the impact of motion factors on MOT, which brings new insights to the community.
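
To make the association step concrete, the sketch below links existing tracks to the detections of a new frame using only box overlap (IoU). It is an illustration under stated assumptions, not the learned association modules studied in the article: the function names `iou` and `associate`, the `(x1, y1, x2, y2)` box format, and the `min_iou` threshold are all hypothetical, and the optimal matching is found by brute force, which is only practical for a handful of boxes (real trackers typically use the Hungarian method instead).

```python
from itertools import combinations, permutations


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Map track id -> detection index so that the total IoU is maximal.

    tracks: dict of track id -> last known box (hypothetical format).
    detections: list of boxes in the new frame.
    Pairs whose IoU falls below min_iou are dropped after matching.
    """
    ids = list(tracks)
    n, m = len(ids), len(detections)
    k = min(n, m)  # number of pairs in a full one-to-one matching
    best_score, best_pairs = -1.0, []
    # Exhaustive search over all one-to-one matchings of size k.
    for t_sel in combinations(range(n), k):
        for d_sel in permutations(range(m), k):
            pairs = list(zip(t_sel, d_sel))
            score = sum(iou(tracks[ids[t]], detections[d]) for t, d in pairs)
            if score > best_score:
                best_score, best_pairs = score, pairs
    return {
        ids[t]: d
        for t, d in best_pairs
        if iou(tracks[ids[t]], detections[d]) >= min_iou
    }


# Two tracks from the previous frame and three detections in the new frame.
tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [(21, 21, 31, 31), (1, 0, 11, 10), (50, 50, 60, 60)]
print(associate(tracks, detections))  # {1: 1, 2: 0}
```

The third detection overlaps no track and stays unmatched; a full tracker would usually start a new identity for it. A learned association module replaces the hand-crafted IoU score with one predicted from motion and appearance features.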

Acknowledgements

This work was supported by the ARC Discovery Early Career Researcher Award, Australia (No. DE200101283) and the ARC Discovery Project, Australia (No. DP210102801).

Author information

Authors and Affiliations

  1. College of Engineering & Computer Science, Australian National University, Canberra, 2601, Australia

    Yuchi Liu & Liang Zheng

  2. Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China

    Zhongdao Wang & Xiangxin Zhou

Corresponding author

Correspondence to Liang Zheng.

Additional information

Declarations

Competing interests. The authors have no competing interests to declare that are relevant to the content of this article.

Yuchi Liu received the B. Eng. degree in software engineering from Australian National University, Australia in 2018. He is currently a Ph. D. candidate in computer science at Australian National University, Australia.

His research interests include video object tracking, learning from synthetic data, and weakly supervised learning.

Zhongdao Wang received the B. Sc. degree in physics from the Department of Physics, Tsinghua University, China in 2017. He is currently a Ph. D. candidate in electronic engineering at the Department of Electronic Engineering, Tsinghua University, China.

His research interests include perception algorithms for autonomous driving, including but not limited to 3D object detection/tracking, network architecture/learning algorithm/pre-training for multimodal fusion, and 4D Auto-labeling.

Xiangxin Zhou received the B. Sc. degree in electronic engineering from the Department of Electronic Engineering, Tsinghua University, China in 2021. He is currently a Ph. D. candidate in artificial intelligence at the School of Artificial Intelligence, University of Chinese Academy of Sciences (UCAS), China, and the Institute of Automation, Chinese Academy of Sciences (CASIA), China.

His research interests include geometric deep learning, graph neural networks for drug design, causal inference, and multimodal machine learning.

Liang Zheng received the B. Eng. degree in life science from Tsinghua University, China in 2010, and the Ph. D. degree in electronic engineering from Tsinghua University, China in 2015. He is a lecturer and a Computer Science Futures Fellow at the School of Computer Science, Australian National University, Australia.

His research interests include computer vision, machine learning, object re-identification and dataset-centered vision.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, Y., Wang, Z., Zhou, X. et al. A Study of Using Synthetic Data for Effective Association Knowledge Learning. Mach. Intell. Res. 20, 194–206 (2023). https://doi.org/10.1007/s11633-022-1380-x

  • Received: 19 July 2022

  • Accepted: 13 October 2022

  • Published: 08 March 2023

  • Issue Date: April 2023

  • DOI: https://doi.org/10.1007/s11633-022-1380-x

Keywords

  • Multi-object tracking (MOT)
  • data association
  • synthetic data
  • motion simulation
  • association knowledge learning