
The Tenth Visual Object Tracking VOT2022 Challenge Results

Conference paper, published in Computer Vision – ECCV 2022 Workshops (ECCV 2022).

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13808).

Abstract

The Visual Object Tracking challenge VOT2022 is the tenth annual tracker benchmarking activity organized by the VOT initiative. Results of 93 entries are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The VOT2022 challenge was composed of seven sub-challenges focusing on different tracking domains: (i) the VOT-STs2022 challenge focused on short-term tracking in RGB by segmentation, (ii) the VOT-STb2022 challenge focused on short-term tracking in RGB by bounding boxes, (iii) the VOT-RTs2022 challenge focused on “real-time” short-term tracking in RGB by segmentation, (iv) the VOT-RTb2022 challenge focused on “real-time” short-term tracking in RGB by bounding boxes, (v) VOT-LT2022 focused on long-term tracking, namely coping with target disappearance and reappearance, (vi) the VOT-RGBD2022 challenge focused on short-term tracking in RGB and depth imagery, and (vii) the VOT-D2022 challenge focused on short-term tracking in depth-only imagery. New datasets were introduced in VOT-LT2022 and VOT-RGBD2022, the VOT-ST2022 dataset was refreshed, and a training dataset was introduced for VOT-LT2022. The source code for most of the trackers, the datasets, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).


Notes

  1. https://www.votchallenge.net/vot2022/participation.html.

  2. https://www.votchallenge.net/vot2022.

  3. The target was sought in a window centered at its estimated position in the previous frame. This is the simplest dynamic model, which assumes that all positions within a search region containing the target have an equal prior probability.
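The dynamic model in footnote 3 can be sketched in a few lines: a search window is centered on the target's previous position and clipped to the frame, and every position inside it receives an equal prior, so the appearance model alone decides where the target is re-detected. The function name and window parametrization below are illustrative, not part of the VOT toolkit:

```python
def search_window(prev_center, window_size, frame_shape):
    """Bounds (top, left, bottom, right) of a search window centered on
    the target's previous position, clipped to the frame.

    Under this simplest dynamic model, every position inside the window
    has the same prior probability of containing the target.
    """
    cy, cx = prev_center      # previous target center (row, col)
    h, w = window_size        # search-window height and width
    H, W = frame_shape        # frame height and width

    # Center the window on the previous position, then clip to the frame.
    top = max(0, cy - h // 2)
    left = max(0, cx - w // 2)
    bottom = min(H, top + h)
    right = min(W, left + w)
    return top, left, bottom, right
```

A tracker would score only the positions inside these bounds with its appearance model, rather than scanning the whole frame.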


Acknowledgements

This work was supported in part by the following research programs and projects: Slovenian research agency research program P2-0214 and project J2-2506. The challenge was sponsored by the Faculty of Computer Science, University of Ljubljana, Slovenia. This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP), in particular in terms of the Wallenberg research arena for Media and Language, and the Berzelius cluster at NSC, both funded by the Knut and Alice Wallenberg Foundation, as well as by ELLIIT, a strategic research environment funded by the Swedish government. In addition, this work was partially supported by the Fundamental Research Funds for the Central Universities (No. 226-2022-00051). This work has also received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement no. 899987. Hyung Jin Chang and Aleš Leonardis were supported by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea government (MSIT) (2021-0-00537). Gustavo Fernández was supported by the AIT Strategic Research Program 2022 Visual Surveillance and Insight.

Author information

Correspondence to Aleš Leonardis.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kristan, M. et al. (2023). The Tenth Visual Object Tracking VOT2022 Challenge Results. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13808. Springer, Cham. https://doi.org/10.1007/978-3-031-25085-9_25


  • DOI: https://doi.org/10.1007/978-3-031-25085-9_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25084-2

  • Online ISBN: 978-3-031-25085-9

  • eBook Packages: Computer Science, Computer Science (R0)
